Abstract. We initiate the study of probabilistic parallel programs with dynamic
process creation and synchronisation. To this end, we introduce probabilistic 
split-join systems (pSJSs), a model for parallel programs, generalising both prob- 
abilistic pushdown systems (a model for sequential probabilistic procedural pro- 
grams which is equivalent to recursive Markov chains) and stochastic branching 
processes (a classical mathematical model with applications in various areas such 
as biology, physics, and language processing). Our pSJS model allows for a pos- 
sibly recursive spawning of parallel processes; the spawned processes can syn- 
chronise and return values. We study the basic performance measures of pSJSs, 
especially the distribution and expectation of space, work and time. Our results 
extend and improve previously known results on the subsumed models. We also 
show how to do performance analysis in practice, and present two case studies 
illustrating the modelling power of pSJSs. 



1 Introduction 

The verification of probabilistic programs with possibly recursive procedures has been 
intensely studied in the last years. The Markov chains or Markov Decision Processes 
underlying these systems may have infinitely many states. Despite this fact, which pre- 
vents the direct application of the rich theory of finite Markov chains, many positive 
results have been obtained. Model-checking algorithms have been proposed for both 
linear and branching temporal logics [12,16,26], algorithms deciding properties of 
several kinds of games have been described (see e.g. [15]), and distributions and ex- 
pectations of performance measures such as run-time and memory consumption have 
been investigated [13,5,6]. 

In all these papers programs are modelled as probabilistic pushdown systems 
(pPDSs) or, equivalently [10], as recursive Markov chains. Loosely speaking, a pPDS 
is a pushdown automaton whose transitions carry probabilities. The configurations of 
a pPDS are pairs containing the current control state and the current stack content. In 
each step, a new configuration is obtained from its predecessor by applying a transition 
rule, which may modify the control state and the top of the stack. 

The programs modelled by pPDSs are necessarily sequential: at each point in time, 
only the procedure represented by the topmost stack symbol is active. Recursion, how- 
ever, is a useful language feature also for multithreaded and other parallel programming 
languages, such as Cilk and JCilk, which allow, e.g., for a natural parallelisation of
divide-and-conquer algorithms [8, 9]. To model parallel programs in probabilistic sce- 
narios, one may be tempted to use stochastic multitype branching processes, a classical 
mathematical model with applications in numerous fields including biology, physics 
and natural language processing [18,2]. In this model, each process has a type, and 
each type is associated with a probability distribution on transition rules. For instance, 

a branching process with the transition rules
$X \stackrel{2/3}{\hookrightarrow} \langle\rangle$, $X \stackrel{1/3}{\hookrightarrow} \langle XY \rangle$, $Y \stackrel{1}{\hookrightarrow} \langle X \rangle$
can be thought of as describing a parallel program with two types of processes, $X$ and $Y$.
A process of type $X$ terminates with probability 2/3, and with probability 1/3 stays
active and spawns a new process of type $Y$. A process of type $Y$ changes its type to $X$.
A configuration of a branching process consists of a pool of currently active processes.
In each step, all active processes develop in parallel, each one according to a rule which
is chosen probabilistically. For instance, a step transforms the configuration $\langle XY \rangle$ into
$\langle XYX \rangle$ with probability $\frac{1}{3} \cdot 1$, by applying the second $X$-rule to the $X$-process and, in
parallel, the $Y$-rule to the $Y$-process.

Branching processes do not satisfactorily model parallel programs, because they 
lack two key features: synchronisation and returning values. In this paper we introduce 
probabilistic split-join systems (pSJSs), a model which offers these features. Parallel 
spawns are modelled by rules of the form $X \hookrightarrow \langle YZ \rangle$. The spawned processes $Y$
and $Z$ develop independently; e.g., a rule $Y \hookrightarrow Y'$ may be applied to the $Y$-process,
replacing $Y$ by $Y'$. When terminating, a process enters a synchronisation state, e.g. with
rules $Y' \hookrightarrow q$ and $Z \hookrightarrow r$ (where $q$ and $r$ are synchronisation states). Once a process
terminates in a synchronisation state, it waits for its sibling to terminate in a
synchronisation state as well. In the above example, the spawned processes wait for each other,
until they terminate in $q$ and $r$. At that point, they may join to form a single process,
e.g. with a rule $\langle qr \rangle \hookrightarrow W$. So, synchronisation is achieved by the siblings waiting for
each other to terminate. All rules could be probabilistic. Notice that synchronisation
states can be used to return values; e.g., if the $Y$-process returns $q'$ instead of $q$, this can
be recorded by the existence of a rule $\langle q'r \rangle \hookrightarrow W'$, so that the resulting process (i.e.,
$W$ or $W'$) depends on the values computed by the joined processes. For the notion of
siblings to make sense, a configuration of a pSJS is not a set, but a binary tree whose
leaves are process symbols (such as $X$, $Y$, $Z$) or synchronisation states (such as $q$, $r$). A
step transforms the leaves of the binary tree in parallel by applying rules; if a leaf is not
a process symbol but a synchronisation state, it remains unchanged unless its sibling
is also a synchronisation state and a joining rule (such as $\langle qr \rangle \hookrightarrow W$) exists, which
removes the siblings and replaces their parent node with the right hand side.

Related work. The probabilistic models closest to ours are pPDSs, recursive Markov 
chains, and stochastic branching processes, as described above. The non-probabilistic
(i.e., nondeterministic) version of pSJSs (SJSs, say) can be regarded as a special case 
of ground tree rewriting systems, see [20] and the references therein. A configuration 
of a ground tree rewriting system is a node-labelled tree, and a rewrite rule replaces 
a subtree. The process rewrite system (PRS) hierarchy of [22] features sequential and 
parallel process composition. Due to its syntactic differences, it is not obvious whether 
SJSs are in that hierarchy. They would be above pushdown systems (which is the se- 
quential fragment of PRSs), because SJSs subsume pushdown systems, as we show in 
Section 3.1 for the probabilistic models. Dynamic pushdown networks (DPNs) [4] are a
parallel extension of pushdown systems. A configuration of a DPN is a list of configu-
rations of pushdown systems running in parallel. DPNs feature the spawning of parallel
threads, and an extension of DPNs, called constrained DPNs, can also model joins via
regular expressions on spawned children. The DPN model is more powerful and more
complicated than SJSs. All those models are non-probabilistic.

Organisation of the paper. In Section 2 we formally define our model and provide fur- 
ther preliminaries. Section 3 contains our main results: we study the relationship be- 
tween pSJSs and pPDSs (Section 3.1), we show how to compute the probabilities for 
termination and finite space, respectively (Sections 3.2 and 3.3), and investigate the dis- 
tribution and expectation of work and time (Section 3.4). In Section 4 we present two 
case studies illustrating the modelling power of pSJSs. We conclude in Section 5. All 
proofs are provided in the appendix. 

2 Preliminaries 

For a finite or infinite word $w$, we write $w(0), w(1), \ldots$ to refer to its individual letters.
We assume throughout the paper that $B$ is a fixed infinite set of basic process symbols.
We use the symbols '$\langle$' and '$\rangle$' as special letters not contained in $B$. For an alphabet $\Sigma$,
we write $\langle\Sigma\Sigma\rangle$ to denote the language $\{\langle\sigma_1\sigma_2\rangle \mid \sigma_1, \sigma_2 \in \Sigma\}$ and $\Sigma^{1,2}$ to denote
$\Sigma \cup \langle\Sigma\Sigma\rangle$. To a set $\Sigma$ we associate a set $T(\Sigma)$ of binary trees whose leaves are
labelled with elements of $\Sigma$. Formally, $T(\Sigma)$ is the smallest language that contains $\Sigma$
and $\langle T(\Sigma)T(\Sigma)\rangle$. For instance, $\langle\langle\sigma\sigma\rangle\sigma\rangle \in T(\{\sigma\})$.

Definition 1 (pSJS). Let $Q$ be a finite set of synchronisation states disjoint from $B$
and not containing '$\langle$' or '$\rangle$'. Let $\Gamma$ be a finite set of process symbols, such that
$\Gamma \subseteq B \cup \langle QQ \rangle$. Define the alphabet $\Sigma := \Gamma \cup Q$. Let $\delta \subseteq \Gamma \times \Sigma^{1,2}$ be a transition relation.
Let $Prob : \delta \to (0, 1]$ be a function so that for all $a \in \Gamma$ we have
$\sum_{a \hookrightarrow \alpha \in \delta} Prob(a \hookrightarrow \alpha) = 1$. Then the tuple $S = (\Gamma, Q, \delta, Prob)$ is a probabilistic split-join system (pSJS).
A pSJS with $\Gamma \cap \langle QQ \rangle = \emptyset$ is called branching process.

We usually write $a \stackrel{p}{\hookrightarrow} \alpha$ instead of $Prob(a \hookrightarrow \alpha) = p$. For technical reasons we
allow branching processes of "degree 3", i.e., branching processes where $\Sigma^{1,2}$ may be
extended to $\Sigma^{1,2,3} := \Sigma^{1,2} \cup \{\langle\sigma_1\sigma_2\sigma_3\rangle \mid \sigma_1, \sigma_2, \sigma_3 \in \Sigma\}$. In branching processes, it
is usually sufficient to have $|Q| = 1$.

A Markov chain is a stochastic process that can be described by a triple
$M = (D, \to, Prob)$ where $D$ is a finite or countably infinite set of states, $\to\ \subseteq D \times D$
is a transition relation, and $Prob$ is a function which to each transition $s \to t$
of $M$ assigns its probability $Prob(s \to t) > 0$ so that for every $s \in D$ we have
$\sum_{s \to t} Prob(s \to t) = 1$ (as usual, we write $s \xrightarrow{x} t$ instead of $Prob(s \to t) = x$).
A path (or run) in $M$ is a finite (or infinite, resp.) word $u \in D^+ \cup D^\omega$, such that
$u(i-1) \to u(i)$ for every $1 \le i < |u|$. The set of all runs that start with a given path $u$
is denoted by $Run[M](u)$ (or $Run(u)$, if $M$ is understood). To every $s \in D$ we
associate the probability space $(Run(s), \mathcal{F}, \mathcal{P})$ where $\mathcal{F}$ is the $\sigma$-field generated by all
basic cylinders $Run(u)$ where $u$ is a path starting with $s$, and $\mathcal{P} : \mathcal{F} \to [0, 1]$ is the
unique probability measure such that $\mathcal{P}(Run(u)) = \prod_{i=1}^{|u|-1} x_i$ where $u(i-1) \xrightarrow{x_i} u(i)$
for every $1 \le i < |u|$. Only certain subsets of $Run(s)$ are $\mathcal{F}$-measurable, but in this
paper we only deal with "safe" subsets that are guaranteed to be in $\mathcal{F}$. If $X_s$ is a
random variable over $Run(s)$, we write $\mathbb{E}[X_s]$ for its expectation. For $s, t \in D$, we define
$Run(s{\downarrow}t) := \{w \in Run(s) \mid \exists i \ge 0 : w(i) = t\}$ and $[s{\downarrow}t] := \mathcal{P}(Run(s{\downarrow}t))$.

To a pSJS $S = (\Gamma, Q, \delta, Prob)$ with alphabet $\Sigma = \Gamma \cup Q$ we associate a Markov
chain $M_S$ with $T(\Sigma)$ as set of states. For $t \in T(\Sigma)$, we define $Front(t) = a_1, \ldots, a_k$
as the unique finite sequence of subwords of $t$ (read from left to right) with $a_i \in \Gamma$ for
all $1 \le i \le k$. We write $|Front(t)| = k$. If $k = 0$, then $t$ is called terminal. The Markov
chain $M_S$ has a transition $t \xrightarrow{p} t'$, if: $Front(t) = a_1, \ldots, a_k$; $a_i \stackrel{p_i}{\hookrightarrow} \alpha_i$ are transitions
in $S$ for all $i$; $t'$ is obtained from $t$ by replacing $a_i$ with $\alpha_i$ for all $i$; and $p = \prod_{i=1}^{k} p_i$.
Note that $t \xrightarrow{1} t$, if $t$ is terminal. For branching processes of degree 3, the set $T(\Sigma)$ is
extended in the obvious way to trees whose nodes may have two or three children.

Denote by $T_\sigma$ a random variable over $Run(\sigma)$ where $T_\sigma(w)$ is either the least $i \in \mathbb{N}$
such that $w(i)$ is terminal, or $\infty$, if no such $i$ exists. Intuitively, $T_\sigma(w)$ is the number
of steps in which $w$ terminates, i.e., the termination time. Denote by $W_\sigma$ a random
variable over $Run(\sigma)$ where $W_\sigma(w) := \sum_{i=0}^{\infty} |Front(w(i))|$. Intuitively, $W_\sigma(w)$ is
the total work in $w$. Denote by $S_\sigma$ a random variable over $Run(\sigma)$ where $S_\sigma(w) :=
\sup_{i \ge 0} |w(i)|$, and $|w(i)|$ is the length of $w(i)$ not counting the symbols '$\langle$' and '$\rangle$'.
Intuitively, $S_\sigma(w)$ is the maximal number of processes during the computation, or, short,
the space of $w$.

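Example 2. Consider the pSJS with $\Gamma = \{X, \langle qr \rangle\}$ and $Q = \{q, r\}$ and the transitions
$X \stackrel{0.5}{\hookrightarrow} \langle XX \rangle$, $X \stackrel{0.3}{\hookrightarrow} q$, $X \stackrel{0.2}{\hookrightarrow} r$, $\langle qr \rangle \stackrel{1}{\hookrightarrow} X$. Let $u = X\ \langle XX \rangle\ \langle qr \rangle\ X\ q\ q$.
Then $u$ is a path, because we have
$X \xrightarrow{0.5} \langle XX \rangle \xrightarrow{0.06} \langle qr \rangle \xrightarrow{1} X \xrightarrow{0.3} q \xrightarrow{1} q$.
Note that $q$ is terminal. The set $Run(u)$ contains only one run, namely $w :=
u(0)u(1)u(2)u(3)u(4)u(4)\cdots$. We have $\mathcal{P}(Run(u)) = 0.5 \cdot 0.06 \cdot 0.3$, and $T_X(w) =
4$, $W_X(w) = 5$, and $S_X(w) = 2$. The dags in Figure 1 graphically represent this run (on
the left), and another example run (on the right) with $T_X = 3$, $W_X = 5$, and $S_X = 3$.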




Fig. 1. Two terminating runs 
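
The quantities of Example 2 can be re-derived mechanically. The following is a minimal Python sketch, not part of the original development: it assumes trees are encoded as nested tuples, with the pair `("q", "r")` standing for the process symbol ⟨qr⟩, and the helper names `front`, `size`, `successors` and `path_statistics` are chosen here purely for illustration.

```python
from fractions import Fraction

# Assumed encoding: a leaf is a string, an inner node is a pair (left, right).
# The pair ("q", "r") is itself a process symbol (it stands for <qr>).
PROCESS_SYMBOLS = {"X", ("q", "r")}

# Rules of Example 2: left-hand side -> list of (probability, right-hand side).
RULES = {
    "X": [(Fraction(1, 2), ("X", "X")), (Fraction(3, 10), "q"), (Fraction(1, 5), "r")],
    ("q", "r"): [(Fraction(1), "X")],
}

def front(tree):
    """Front(tree): the process symbols occurring in the tree, left to right."""
    if tree in PROCESS_SYMBOLS:
        return [tree]
    if isinstance(tree, tuple):
        return front(tree[0]) + front(tree[1])
    return []  # a waiting synchronisation state

def size(tree):
    """|tree|: number of letters, not counting the brackets."""
    if isinstance(tree, tuple):
        return size(tree[0]) + size(tree[1])
    return 1

def successors(tree):
    """Map each one-step successor of the tree to its transition probability."""
    if tree in PROCESS_SYMBOLS:
        result = {}
        for p, rhs in RULES[tree]:
            result[rhs] = result.get(rhs, Fraction(0)) + p
        return result
    if isinstance(tree, tuple):  # both subtrees develop in parallel
        return {(l, r): pl * pr
                for l, pl in successors(tree[0]).items()
                for r, pr in successors(tree[1]).items()}
    return {tree: Fraction(1)}   # synchronisation states stay unchanged

def path_statistics(path):
    """Probability, time, work and space of a finite path ending in a terminal tree."""
    prob = Fraction(1)
    for t, t_next in zip(path, path[1:]):
        prob *= successors(t).get(t_next, Fraction(0))
    time = next(i for i, t in enumerate(path) if not front(t))
    work = sum(len(front(t)) for t in path)
    space = max(size(t) for t in path)
    return prob, time, work, space

u = ["X", ("X", "X"), ("q", "r"), "X", "q", "q"]
print(path_statistics(u))
```

Exact fractions are used so that the path probability can be compared directly with the product 0.5 · 0.06 · 0.3 given in the example.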
Remark 3. Our definition of pSJSs may be more general than needed from a modelling
perspective: e.g., our rules allow for both synchronisation and splitting in a single step.
We choose this definition for technical convenience and to allow for easy comparisons
with pPDSs (Section 3.1).

The complexity-theoretic statements in this paper are with respect to the size of 
the given pSJS $S = (\Gamma, Q, \delta, Prob)$, which is defined as $|\Gamma| + |Q| + |\delta| + |Prob|$,
where $|Prob|$ equals the sum of the sizes of the binary representations of the values
of $Prob$. A formula of $ExTh(\mathbb{R})$, the existential fragment of the first-order theory of
the reals, is of the form $\exists x_1 \ldots \exists x_m R(x_1, \ldots, x_n)$, where $R(x_1, \ldots, x_n)$ is a boolean
combination of comparisons of the form $p(x_1, \ldots, x_n) \sim 0$, where $p(x_1, \ldots, x_n)$ is a
multivariate polynomial and $\sim\ \in \{<, >, \le, \ge, =, \ne\}$. The validity of closed formulas
($m = n$) is decidable in PSPACE [7, 23]. We say that one can efficiently express a value
$c \in \mathbb{R}$ associated with a pSJS, if one can, in polynomial space, construct a formula $\phi(x)$
in $ExTh(\mathbb{R})$ of polynomial length such that $x$ is the only free variable in $\phi(x)$, and $\phi(x)$
is true if and only if $x = c$. Notice that if $c$ is efficiently expressible, then $c \sim \tau$ for
$\tau \in \mathbb{Q}$ is decidable in PSPACE for $\sim\ \in \{<, >, \le, \ge, =, \ne\}$.

For some lower bounds, we prove hardness (with respect to P-time many-one reduc- 
tions) in terms of the PosSLP decision problem. The PosSLP (Positive Straight-Line 
Program) problem asks whether a given straight-line program or, equivalently, arith-
metic circuit with operations $+$, $-$, $\cdot$, and inputs 0 and 1, and a designated output gate,
outputs a positive integer or not. PosSLP is in PSPACE. More precisely, it is known
to be on the 4th level of the Counting Hierarchy [1]; it is not known to be in NP. The
PosSLP problem is a fundamental problem for numerical computation; it is complete 
for the class of decision problems that can be solved in polynomial time on models with 
unit-cost exact rational arithmetic, see [1, 16] for more details. 

3 Results 

3.1 Relationship with probabilistic pushdown systems (pPDSs) 

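We show that pSJSs subsume pPDSs. A probabilistic pushdown system (pPDS) [12, 13,
5, 6] is a tuple $S = (\Gamma, Q, \delta, Prob)$, where $\Gamma$ is a finite stack alphabet, $Q$ is a finite
set of control states, $\delta \subseteq Q \times \Gamma \times Q \times \Gamma^{\le 2}$ (where $\Gamma^{\le 2} = \{\alpha \in \Gamma^*, |\alpha| \le 2\}$)
is a transition relation, and $Prob : \delta \to (0, 1]$ is a function so that for all $q \in Q$ and
$a \in \Gamma$ we have $\sum_{qa \hookrightarrow r\alpha} Prob(qa \hookrightarrow r\alpha) = 1$. One usually writes $qa \stackrel{p}{\hookrightarrow} r\alpha$ instead
of $Prob(qa \hookrightarrow r\alpha) = p$. To a pPDS $S = (\Gamma, Q, \delta, Prob)$ one associates a Markov chain
$M_S$ with $Q \times \Gamma^*$ as set of states, and transitions $q \xrightarrow{1} q$ for all $q \in Q$, and $qa\beta \xrightarrow{p} r\alpha\beta$
for all $qa \stackrel{p}{\hookrightarrow} r\alpha$ and all $\beta \in \Gamma^*$.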

A pPDS $S_P$ with $\Gamma_P$ as stack alphabet, $Q_P$ as set of control states, and transitions
$\stackrel{p}{\hookrightarrow}_P$ can be transformed to an equivalent pSJS $S$: Take $Q := Q_P \cup \Gamma_P$ as
synchronisation states; $\Gamma := \{\langle qa \rangle \mid q \in Q_P, a \in \Gamma_P\}$ as process symbols; and transitions
$\langle qa \rangle \stackrel{p}{\hookrightarrow} \langle\langle rb \rangle c\rangle$ for all $qa \stackrel{p}{\hookrightarrow}_P rbc$, $\langle qa \rangle \stackrel{p}{\hookrightarrow} \langle rb \rangle$ for all $qa \stackrel{p}{\hookrightarrow}_P rb$, and $\langle qa \rangle \stackrel{p}{\hookrightarrow} r$
for all $qa \stackrel{p}{\hookrightarrow}_P r$. The Markov chains $M_{S_P}$ and $M_S$ are isomorphic. Therefore, we
occasionally say that a pSJS is a pPDS, if it can be obtained from a pPDS by this
transformation. Observe that in pPDSs, we have $T = W$, because there is no parallelism.
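
The translation from pPDS rules to pSJS rules is purely syntactic and can be sketched in a few lines of Python. The encoding below is an assumption made for illustration only: a pPDS rule is a tuple `(q, a, p, r, alpha)`, and the process symbol ⟨qa⟩ is represented by the pair `(q, a)`; the function name `ppds_to_psjs` is hypothetical.

```python
from fractions import Fraction

def ppds_to_psjs(rules):
    """Translate pPDS rules into pSJS rules.

    rules: list of (q, a, p, r, alpha), read as  q a --p--> r alpha  with len(alpha) <= 2.
    Returns a list of (lhs, p, rhs), where the process symbol <qa> is the tuple (q, a).
    """
    out = []
    for q, a, p, r, alpha in rules:
        if len(alpha) == 2:        # qa -> r b c   becomes   <qa> -> <<rb> c>
            b, c = alpha
            rhs = ((r, b), c)
        elif len(alpha) == 1:      # qa -> r b     becomes   <qa> -> <rb>
            rhs = (r, alpha[0])
        else:                      # qa -> r       becomes   <qa> -> r
            rhs = r
        out.append(((q, a), p, rhs))
    return out

# A small pPDS with one control state s and one stack symbol Z (illustrative only):
example = [
    ("s", "Z", Fraction(1, 2), "s", ("Z", "Z")),  # push
    ("s", "Z", Fraction(1, 2), "s", ()),          # pop
]
print(ppds_to_psjs(example))
```

In this encoding, a pop inside the tree ⟨⟨sZ⟩Z⟩ yields the pair ⟨sZ⟩, which is again a process symbol, mirroring how the control state is handed to the next stack symbol.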

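Conversely, a pSJS $S$ with alphabet $\Sigma = \Gamma \cup Q$ can be transformed into a pPDS $S_P$
by "serialising" $S$: Take $Q_P := \{\Box\} \cup \{\bar{q} \mid q \in Q\}$ as control states; $\Gamma_P := \Gamma \cup Q \cup
\{\tilde{q} \mid q \in Q\}$ as stack alphabet; and transitions $\Box a \stackrel{p}{\hookrightarrow}_P \Box\sigma_1\sigma_2$ for all $a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle$,
$\Box a \stackrel{p}{\hookrightarrow}_P \Box\sigma$ for all $a \stackrel{p}{\hookrightarrow} \sigma$ with $\sigma \in \Sigma \setminus \langle QQ \rangle$, and $\Box q \stackrel{1}{\hookrightarrow}_P \bar{q}$ for all $q \in Q$, and
$\bar{q}\sigma \stackrel{1}{\hookrightarrow}_P \Box\sigma\tilde{q}$ for all $q \in Q$ and $\sigma \in \Sigma$, and $\bar{r}\tilde{q} \stackrel{1}{\hookrightarrow}_P \Box\langle qr \rangle$ for all $q, r \in Q$. The Markov
chains $M_S$ and $M_{S_P}$ are not isomorphic. However, we have:

Proposition 4. There is a probability-preserving bijection between the runs $Run(\sigma{\downarrow}q)$
in $M_S$ and the runs $Run(\Box\sigma{\downarrow}\bar{q})$ in $M_{S_P}$. In particular, we have $[\sigma{\downarrow}q] = [\Box\sigma{\downarrow}\bar{q}]$.

For example, the pSJS run on the left side of Figure 1 corresponds to the pPDS run

$\Box X \xrightarrow{0.5} \Box XX \xrightarrow{0.3} \Box qX \xrightarrow{1} \bar{q}X \xrightarrow{1} \Box X\tilde{q} \xrightarrow{0.2} \Box r\tilde{q} \xrightarrow{1} \bar{r}\tilde{q} \xrightarrow{1} \Box\langle qr \rangle \xrightarrow{1} \Box X \xrightarrow{0.3} \Box q \xrightarrow{1} \bar{q} \xrightarrow{1} \bar{q} \xrightarrow{1} \ldots$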

3.2 Probability of Termination 

We call a run terminating, if it reaches a terminal tree. Such a tree can be a single
synchronisation state (e.g., $q$ on the left of Figure 1), or another terminal tree (e.g., $\langle q\langle rq \rangle\rangle$
on the right of Figure 1). For any $\sigma \in \Sigma$, we denote by $[\sigma{\downarrow}]$ the termination probability
when starting in $\sigma$; i.e., $[\sigma{\downarrow}] = \sum_{t \text{ is terminal}} [\sigma{\downarrow}t]$. One can transform any pSJS $S$ into
a pSJS $S'$ such that whenever a run in $S$ terminates, then a corresponding run in $S'$
terminates in a synchronisation state. This transformation is by adding a fresh state $\hat{q}$,
and transitions $\langle rs \rangle \stackrel{1}{\hookrightarrow} \hat{q}$ for all $r, s \in Q$ with $\langle rs \rangle \notin \Gamma$, and $\langle \hat{q}r \rangle \stackrel{1}{\hookrightarrow} \hat{q}$ and $\langle r\hat{q} \rangle \stackrel{1}{\hookrightarrow} \hat{q}$
for all $r \in Q$. It is easy to see that this keeps the probability of termination unchanged,
and modifies the random variables $T_\sigma$ and $W_\sigma$ by at most a factor 2. Notice that the
transformation can be performed in polynomial time. After the transformation we have
$[\sigma{\downarrow}] = \sum_{q \in Q} [\sigma{\downarrow}q]$. A pSJS which satisfies this equality will be called normalised in
the following. From a modelling point of view, pSJSs may be expected to be normalised
in the first place: a terminating program should terminate all its processes.

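We set up an equation system for the probabilities $[\sigma{\downarrow}q]$. For each $\sigma \in \Sigma$ and
$q \in Q$, the equation system has a variable of the form $\llbracket\sigma{\downarrow}q\rrbracket$ and an equation of the
form $\llbracket\sigma{\downarrow}q\rrbracket = f_{\llbracket\sigma{\downarrow}q\rrbracket}$, where $f_{\llbracket\sigma{\downarrow}q\rrbracket}$ is a multivariate polynomial with nonnegative
coefficients. More concretely: If $q \in Q$, then we set $f_{\llbracket q{\downarrow}q\rrbracket} := 1$; if $r \in Q \setminus \{q\}$, then we
set $f_{\llbracket r{\downarrow}q\rrbracket} := 0$; if $a \in \Gamma$, then we set

$$f_{\llbracket a{\downarrow}q\rrbracket} := \sum_{\substack{a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle \\ \langle q_1q_2\rangle \in \Gamma \cap \langle QQ \rangle}} p \cdot \llbracket\sigma_1{\downarrow}q_1\rrbracket \cdot \llbracket\sigma_2{\downarrow}q_2\rrbracket \cdot \llbracket\langle q_1q_2\rangle{\downarrow}q\rrbracket \;+\; \sum_{\substack{a \stackrel{p}{\hookrightarrow} \sigma' \\ \sigma' \in \Sigma \setminus \langle QQ \rangle}} p \cdot \llbracket\sigma'{\downarrow}q\rrbracket .$$

Proposition 5. Let $\sigma \in \Sigma$ and $q \in Q$. Then $[\sigma{\downarrow}q]$ is the value for $\llbracket\sigma{\downarrow}q\rrbracket$ in the least
(w.r.t. componentwise ordering) nonnegative solution of the above equation system.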

One can efficiently approximate $[\sigma{\downarrow}q]$ by applying Newton's method to the fixed-
point equation system from Proposition 5, cf. [16]. The convergence speed of New-
ton's method for such equation systems was recently studied in detail [11]. The simpler
"Kleene" method (sometimes called "fixed-point iteration") often suffices, but can be
much slower. In the case studies of Section 4, using Kleene for computing the termina-
tion probabilities up to machine accuracy was not a bottleneck. The following theorem
essentially follows from similar results for pPDSs:

Theorem 6 (cf. [14, 16]). Consider a pSJS with alphabet $\Sigma = \Gamma \cup Q$. Let $\sigma \in \Sigma$ and
$q \in Q$. Then (1) one can efficiently express (in the sense defined in Section 2) the value
of $[\sigma{\downarrow}q]$, (2) deciding whether $[\sigma{\downarrow}q] = 0$ is in P, and (3) deciding whether $[\sigma{\downarrow}q] < 1$ is
PosSLP-hard even for pPDSs.
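
The "Kleene" method mentioned above can be illustrated on the pSJS of Example 2. The sketch below is an illustration under stated assumptions rather than the authors' tool: the four equations are instantiated by hand from the equation system of Section 3.2, the variable names `xq`, `xr`, `yq`, `yr` are introduced here, and plain floating-point arithmetic with a fixed iteration count is assumed to be accurate enough.

```python
def kleene_example2(iterations=200):
    """Kleene (fixed-point) iteration for the termination probabilities of Example 2.

    Variables (names chosen here):
      xq = [X down q],  xr = [X down r],  yq = [<qr> down q],  yr = [<qr> down r].
    Equations instantiated from the rules
      X -0.5-> <XX>,  X -0.3-> q,  X -0.2-> r,  <qr> -1-> X,
    where <qr> is the only process symbol in <QQ>.
    """
    xq = xr = yq = yr = 0.0          # start from the zero vector
    for _ in range(iterations):
        # simultaneous update of all components
        xq, xr, yq, yr = (
            0.5 * xq * xr * yq + 0.3,
            0.5 * xq * xr * yr + 0.2,
            xq,
            xr,
        )
    return xq, xr, yq, yr

xq, xr, yq, yr = kleene_example2()
print(xq, xr)
```

Starting from zero and iterating a monotone polynomial system yields an increasing sequence that converges to the least nonnegative solution, which is the solution singled out by Proposition 5.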

3.3 Probability of Finite Space 

A run $w \in Run(\sigma)$ is either (i) terminating, or (ii) nonterminating with $S_\sigma < \infty$, or
(iii) nonterminating with $S_\sigma = \infty$. From a modelling point of view, some programs
may be considered incorrect, if they do not terminate with probability 1. As is well-
known, this does not apply to programs like operating systems, network servers, system
daemons, etc., where nontermination may be tolerated or desirable. Such programs may
be expected not to need an infinite amount of space; i.e., $S_\sigma$ should be finite.

Given a pSJS $S$ with alphabet $\Sigma = \Gamma \cup Q$, we show how to construct, in polynomial
time, a normalised pSJS $\bar{S}$ with alphabet $\bar{\Sigma} = \bar{\Gamma} \cup \bar{Q} \supseteq \Sigma$ where $\bar{Q} = Q \cup \{\bar{q}\}$ for a
fresh synchronisation state $\bar{q}$, and $\mathcal{P}(S_a < \infty = T_a \mid Run(a)) = [a{\downarrow}\bar{q}]$ for all $a \in \Gamma$.
Having done that, we can compute this probability according to Section 3.2.

For the construction, we can assume w.l.o.g. that $S$ has been normalised using the
procedure of Section 3.2. Let $U := \{a \in \Gamma \mid \forall n \in \mathbb{N} : \mathcal{P}(S_a > n) > 0\}$.

Lemma 7. The set $U$ can be computed in polynomial time.

Let $B := \{a \in \Gamma \setminus U \mid \forall q \in Q : [a{\downarrow}q] = 0\}$, so $B$ is the set of process symbols
$a$ that are both "bounded above" (because $a \notin U$) and "bounded below" (because $a$
cannot terminate). By Theorem 6 (2) and Lemma 7 we can compute $B$ in polynomial
time. Now we construct $\bar{S}$ by modifying $S$ as follows: we set $\bar{Q} := Q \cup \{\bar{q}\}$ for a fresh
synchronisation state $\bar{q}$; we remove all transitions with symbols $b \in B$ on the left hand
side and replace them with a new transition $b \stackrel{1}{\hookrightarrow} \bar{q}$; we add transitions $\langle q_1q_2 \rangle \stackrel{1}{\hookrightarrow} \bar{q}$ for
all $q_1, q_2 \in \bar{Q}$ with $\bar{q} \in \{q_1, q_2\}$. We have the following proposition.

Proposition 8. (1) The pSJS $\bar{S}$ is normalised; (2) the value $[a{\downarrow}q]$ for $a \in \Gamma$ and $q \in Q$
is the same in $S$ and $\bar{S}$; (3) we have $\mathcal{P}(S_a < \infty = T_a \mid Run(a)) = [a{\downarrow}\bar{q}]$ for all $a \in \Gamma$.

Proposition 8 allows for the following theorem. 

Theorem 9. Consider a pSJS with alphabet $\Sigma = \Gamma \cup Q$ and $a \in \Gamma$. Let $s :=
\mathcal{P}(S_a < \infty)$. Then (1) one can efficiently express $s$, (2) deciding whether $s = 0$ is
in P, and (3) deciding whether $s < 1$ is PosSLP-hard even for pPDSs.

Theorem 9, applied to pPDSs, improves Corollary 6.3 of [13]. There it is shown for
pPDSs that comparing $\mathcal{P}(S_a < \infty)$ with $\tau \in \mathbb{Q}$ is in EXPTIME, and in PSPACE if
$\tau \in \{0, 1\}$. With Theorem 9 we get PSPACE for $\tau \in \mathbb{Q}$, and P for $\tau = 0$.
3.4 Work and Time



We show how to compute the distribution and expectation of work and time of a given 
pSJS $S$ with alphabet $\Sigma = \Gamma \cup Q$.

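Distribution. For $\sigma \in \Sigma$ and $q \in Q$, let $T_{\sigma{\downarrow}q}(k) := \mathcal{P}(Run(\sigma{\downarrow}q),\ T_\sigma = k \mid Run(\sigma))$.
It is easy to see that, for $k \ge 1$ and $a \in \Gamma$ and $q \in Q$, we have

$$T_{a{\downarrow}q}(k) = \sum_{\substack{a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle \\ \langle q_1q_2\rangle \in \Gamma \cap \langle QQ \rangle}} p \sum_{\substack{\ell_1, \ell_2, \ell_3 \ge 0 \\ \max\{\ell_1,\ell_2\} + \ell_3 = k-1}} T_{\sigma_1{\downarrow}q_1}(\ell_1) \cdot T_{\sigma_2{\downarrow}q_2}(\ell_2) \cdot T_{\langle q_1q_2\rangle{\downarrow}q}(\ell_3) \;+\; \sum_{\substack{a \stackrel{p}{\hookrightarrow} \sigma' \\ \sigma' \in \Sigma \setminus \langle QQ \rangle}} p \cdot T_{\sigma'{\downarrow}q}(k-1) .$$

This allows to compute the distribution of time (and, similarly, work) using dynamic
programming. In particular, for any $k$, one can compute $T^{\uparrow}_{\sigma{\downarrow}q}(k) :=
\mathcal{P}(T_\sigma > k \mid Run(\sigma{\downarrow}q)) = 1 - \frac{1}{[\sigma{\downarrow}q]} \sum_{i=0}^{k} T_{\sigma{\downarrow}q}(i)$.

Expectation. For any random variable $Z$ taking positive integers as value, it holds $\mathbb{E}Z =
\sum_{k=0}^{\infty} \mathcal{P}(Z > k)$. Hence, one can approximate $\mathbb{E}[T_\sigma \mid Run(\sigma{\downarrow}q)] = \sum_{k=0}^{\infty} T^{\uparrow}_{\sigma{\downarrow}q}(k)$
by computing $\sum_{k=0}^{i} T^{\uparrow}_{\sigma{\downarrow}q}(k)$ for large $i$. In the rest of the section we show how to
decide on the finiteness of expected work and time. It follows from Propositions 10
and 11 below that the expected work $\mathbb{E}[W_\sigma \mid Run(\sigma{\downarrow}q)]$ is easier to compute: it is the
solution of a linear equation system.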

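We construct a branching process $\bar{S}$ with process symbols $\bar{\Gamma} = \{\llparenthesis aq \rrparenthesis \mid a \in \Gamma, q \in
Q, [a{\downarrow}q] > 0\}$, synchronisation states $\bar{Q} = \{\bot\}$ and transitions as follows. For nota-
tional convenience, we identify $\bot$ and $\llparenthesis qq \rrparenthesis$ for all $q \in Q$. For $\llparenthesis aq \rrparenthesis \in \bar{\Gamma}$, we set

- $\llparenthesis aq \rrparenthesis \stackrel{y/[a{\downarrow}q]}{\hookrightarrow} \langle \llparenthesis\sigma_1q_1\rrparenthesis \llparenthesis\sigma_2q_2\rrparenthesis \llparenthesis\langle q_1q_2\rangle q\rrparenthesis \rangle$ for all $a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle$ and $\langle q_1q_2\rangle \in \Gamma \cap \langle QQ \rangle$,
  where $y := p \cdot [\sigma_1{\downarrow}q_1] \cdot [\sigma_2{\downarrow}q_2] \cdot [\langle q_1q_2\rangle{\downarrow}q] > 0$;

- $\llparenthesis aq \rrparenthesis \stackrel{y/[a{\downarrow}q]}{\hookrightarrow} \llparenthesis\sigma'q\rrparenthesis$ for all $a \stackrel{p}{\hookrightarrow} \sigma'$ with $\sigma' \in \Sigma \setminus \langle QQ \rangle$, where $y := p \cdot [\sigma'{\downarrow}q] > 0$.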

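The following proposition (inspired by a statement on pPDSs [6]) links the distribu-
tions of $W_\sigma$ and $T_\sigma$ conditioned under termination in $q$ with the distributions of $W_{\llparenthesis\sigma q\rrparenthesis}$
and $T_{\llparenthesis\sigma q\rrparenthesis}$.

Proposition 10. Let $\sigma \in \Sigma$ and $q \in Q$ with $[\sigma{\downarrow}q] > 0$. Then

$\mathcal{P}(W_\sigma = n \mid Run(\sigma{\downarrow}q)) = \mathcal{P}(W_{\llparenthesis\sigma q\rrparenthesis} = n \mid Run(\llparenthesis\sigma q\rrparenthesis))$ for all $n \ge 0$ and
$\mathcal{P}(T_\sigma \le n \mid Run(\sigma{\downarrow}q)) \le \mathcal{P}(T_{\llparenthesis\sigma q\rrparenthesis} \le n \mid Run(\llparenthesis\sigma q\rrparenthesis))$ for all $n \ge 0$.

In particular, we have $[\llparenthesis\sigma q\rrparenthesis{\downarrow}] = 1$.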

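Proposition 10 allows us to focus on branching processes. For $X \in \Gamma$ and a finite
sequence $\sigma_1, \ldots, \sigma_k$ with $\sigma_i \in \Sigma$, define $|\sigma_1, \ldots, \sigma_k|_X := |\{i \mid 1 \le i \le k, \sigma_i = X\}|$,
i.e., the number of $X$-symbols in the sequence. We define the characteristic matrix
$A \in \mathbb{R}^{\Gamma \times \Gamma}$ of a branching process by setting

$$A_{X,Y} := \sum_{X \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\sigma_3\rangle} p \cdot |\sigma_1, \sigma_2, \sigma_3|_Y \;+\; \sum_{X \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle} p \cdot |\sigma_1, \sigma_2|_Y \;+\; \sum_{X \stackrel{p}{\hookrightarrow} \sigma_1} p \cdot |\sigma_1|_Y .$$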

It is easy to see that the $(X, Y)$-entry of $A$ is the expected number of $Y$-processes after
the first step, if starting in a single $X$-process. If $S$ is a branching process and $X_0 \in \Gamma$,
we call the pair $(S, X_0)$ a reduced branching process, if for all $X \in \Gamma$ there is $i \in \mathbb{N}$
such that $(A^i)_{X_0,X} > 0$. Intuitively, $(S, X_0)$ is reduced, if, starting in $X_0$, all process
symbols can be reached with positive probability. If $(S, X_0)$ is not reduced, it is easy to
reduce it in polynomial time by eliminating all non-reachable process symbols.

The following proposition characterises the finiteness of both expected work and
expected time in terms of the spectral radius $\rho(A)$ of $A$. (Recall that $\rho(A)$ is the largest
absolute value of the eigenvalues of $A$.)

Proposition 11. Let $(S, X_0)$ be a reduced branching process. Let $A$ be the associated
characteristic matrix. Then the following statements are equivalent:

(1) $\mathbb{E}W_{X_0}$ is finite; (2) $\mathbb{E}T_{X_0}$ is finite; (3) $\rho(A) < 1$.

Further, if $\mathbb{E}W_{X_0}$ is finite, then it equals the $X_0$-component of $(I - A)^{-1} \cdot \mathbf{1}$, where $I$
is the identity matrix, and $\mathbf{1}$ is the column vector with all ones.
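
To make the formula $(I - A)^{-1} \cdot \mathbf{1}$ concrete, the following Python sketch builds the characteristic matrix of a small branching process and solves the linear system exactly. It is an illustration only: the rule encoding, the function names, and the use of the branching process from the Introduction (with termination modelled by a single synchronisation state written `"bot"`) are assumptions made here, and the code does not itself check that $\rho(A) < 1$, which the caller must ensure.

```python
from fractions import Fraction

def characteristic_matrix(symbols, rules):
    """rules: dict  X -> list of (p, children); children is a tuple of symbols or states."""
    index = {x: i for i, x in enumerate(symbols)}
    A = [[Fraction(0)] * len(symbols) for _ in symbols]
    for x, alternatives in rules.items():
        for p, children in alternatives:
            for child in children:
                if child in index:       # synchronisation states are not counted
                    A[index[x]][index[child]] += p
    return A

def solve(M, b):
    """Solve M z = b by Gauss-Jordan elimination over exact fractions."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def expected_work(symbols, rules):
    """Return the vector (I - A)^{-1} * 1; meaningful only if rho(A) < 1."""
    A = characteristic_matrix(symbols, rules)
    n = len(symbols)
    I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return solve(I_minus_A, [Fraction(1)] * n)

# X terminates with probability 2/3, spawns Y with probability 1/3; Y turns into X.
symbols = ["X", "Y"]
rules = {
    "X": [(Fraction(2, 3), ("bot",)), (Fraction(1, 3), ("X", "Y"))],
    "Y": [(Fraction(1), ("X",))],
}
print(expected_work(symbols, rules))
```

For this process the result can be cross-checked by hand: the expected work $e_X$, $e_Y$ must satisfy $e_Y = 1 + e_X$ and $e_X = 1 + \frac{1}{3}(e_X + e_Y)$.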

Statements similar to Proposition 11 do appear in the standard branching process liter-
ature [18, 2], however, not explicitly enough to cite directly or with stronger assump-
tions.¹ Our proof adapts a technique which was developed in [5] for a different purpose.
It uses only basic tools and Perron-Frobenius theory, the spectral theory of nonnegative
matrices. Proposition 11 has the following consequence:

Corollary 12. Consider a branching process with process symbols $\Gamma$ and $X_0 \in \Gamma$.
Then $\mathbb{E}W_{X_0}$ and $\mathbb{E}T_{X_0}$ are both finite or both infinite. Distinguishing between those
cases is in P.

By combining the previous results we obtain the following theorem. 

Theorem 13. Consider a pSJS $S$ with alphabet $\Sigma = \Gamma \cup Q$. Let $a \in \Gamma$. Then $\mathbb{E}W_a$ and
$\mathbb{E}T_a$ are both finite or both infinite. Distinguishing between those cases is in PSPACE,
and PosSLP-hard even for pPDSs. Further, if $S$ is normalised and $\mathbb{E}W_a$ is finite, one
can efficiently express $\mathbb{E}W_a$.

Theorem 13 can be interpreted as saying that, although the pSJS model does not impose
a bound on the number of active processes at a time, its parallelism cannot be used to do
an infinite expected amount of work in a finite expected time. However, the "speedup"
$\mathbb{E}[W] / \mathbb{E}[T]$ may be unbounded:

Proposition 14. Consider the family of branching processes with transitions $X \stackrel{p}{\hookrightarrow}
\langle XX \rangle$ and $X \stackrel{1-p}{\hookrightarrow} \bot$, where $0 < p < 1/2$. Then the ratio $\mathbb{E}[W_X] / \mathbb{E}[T_X]$ is un-
bounded for $p \to 1/2$.
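
The growth of the speedup in Proposition 14 can be observed numerically. The sketch below is a hedged illustration, not the proof: it assumes the closed form $\mathbb{E}[W_X] = 1/(1-2p)$ obtained from the 1×1 characteristic matrix $(2p)$, and it approximates $\mathbb{E}[T_X] = \sum_{k \ge 0} \mathcal{P}(T_X > k)$ by a truncated sum; the recurrence for $e_k = \mathcal{P}(T_X > k)$, the tolerance and the function names are choices made here.

```python
def expected_work(p):
    """E[W_X] for X -p-> <XX>, X -(1-p)-> bot: the characteristic matrix is (2p)."""
    return 1.0 / (1.0 - 2.0 * p)

def expected_time(p, tol=1e-13, max_steps=10**7):
    """Approximate E[T_X] as the sum over k >= 0 of e_k = P(T_X > k).

    e_0 = 1 because X itself is not terminal. For the next step, T_X > k+1 holds
    iff X splits (probability p) and not both children have terminated within k steps.
    """
    e, total, k = 1.0, 0.0, 0
    while e > tol and k < max_steps:
        total += e
        e = p * (1.0 - (1.0 - e) ** 2)
        k += 1
    return total

def speedup(p):
    return expected_work(p) / expected_time(p)

for p in (0.4, 0.45, 0.49, 0.499):
    print(p, speedup(p))
```

The truncation is harmless for $p < 1/2$ because $e_k$ eventually decays geometrically with rate $2p$; closer to $1/2$ more iterations are needed.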

¹ For example, [2] assumes that there is $n \in \mathbb{N}$ such that $A^n$ is positive in all entries, a restriction
which is not natural for our setting.
4 Case Studies



We have implemented a prototype tool in the form of a Maple worksheet, which allows 
to compute some of the quantities from the previous section: the termination probabil- 
ities, and distributions and expectations of work and time. In this section, we use our 
tool for two case studies², which also illustrate how probabilistic parallel programs can
be modelled with pSJSs. We only deal with normalised pSJSs in this section. 

4.1 Divide and Conquer 

The pSJS model lends itself to analyse parallel divide-and-conquer programs. For sim- 
plicity, we assume that the problem is already given as a binary tree, and solving it 
means traversing the tree and combining the results of the children. Figure 2 shows 
generic parallel code for such a problem. 



function divCon(node)
  if node.leaf() then return node.val()
  else parallel ⟨ val1 := divCon(node.c1), val2 := divCon(node.c2) ⟩
       return combine(val1, val2)

Fig. 2. A generic parallel divide-and-conquer program. 

For an example, think of a routine for numerically approximating an integral
$\int_0^1 f(x)\,dx$. Given the integrand $f$ and a subinterval $I \subseteq [0, 1]$, we assume that there
is a function which computes $osc_f(I) \in \mathbb{N}$, the "oscillation" of $f$ in the interval $I$, a
measure for the need for further refinement. If $osc_f(I) = 0$, then the integration routine
returns the approximation $1 \cdot f(1/2)$, otherwise it returns $I_1 + I_2$, where $I_1$ and $I_2$ are
recursive approximations of $\int_0^{1/2} f(x)\,dx$ and $\int_{1/2}^{1} f(x)\,dx$, respectively.³

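We analyse such a routine using probabilistic assumptions on the integrand: Let
$n, n_1, n_2$ be nonnegative integers such that $0 \le n_1 + n_2 \le n$. If $osc_f([a, b]) =
n$, then $osc_f([a, (a+b)/2]) = n_1$ and $osc_f([(a+b)/2, b]) = n_2$ with probability
$x(n, n_1, n_2) := \binom{n}{n_1} \cdot \binom{n - n_1}{n_2} \cdot \left(\frac{p}{2}\right)^{n_1} \cdot \left(\frac{p}{2}\right)^{n_2} \cdot (1-p)^{n - n_1 - n_2}$, where $0 < p < 1$ is
some parameter.⁴ Of course, other distributions could be used as well. The integration
routine can then be modelled by the pSJS with $Q = \{q\}$ and $\Gamma = \{\langle qq \rangle, 0, \ldots, n_{max}\}$
and the following rules:

$0 \stackrel{1}{\hookrightarrow} q$ and $\langle qq \rangle \stackrel{1}{\hookrightarrow} q$ and $n \stackrel{x(n, n_1, n_2)}{\hookrightarrow} \langle n_1\, n_2 \rangle$ for all $1 \le n \le n_{max}$,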

² Available at http://www.comlab.ox.ac.uk/people/Stefan.Kiefer/case-studies.mws.

³ Such an adaptive approximation scheme is called "local" in [21].

⁴ That means, the oscillation $n$ in the interval $[a, b]$ can be thought of as distributed between
$[a, (a+b)/2]$ and $[(a+b)/2, b]$ according to a ball-and-urn experiment, where each of the $n$
balls is placed in the $[a, (a+b)/2]$-urn and the $[(a+b)/2, b]$-urn with probability $p/2$, respec-
tively, and in a trash urn with probability $1 - p$.
where $0 \le n_1 + n_2 \le n$. (Since we are merely interested in the performance of the
algorithm, we can identify all return values with a single synchronisation state $q$.)
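
The rule probabilities $x(n, n_1, n_2)$ of this model form a multinomial distribution, so for each fixed $n$ they must sum to 1 over all admissible $(n_1, n_2)$. The following Python sketch, with function names chosen here for illustration, generates the weights of the rules $n \hookrightarrow \langle n_1\, n_2 \rangle$ and allows that sanity check.

```python
from math import comb

def x(n, n1, n2, p):
    """Probability that oscillation n splits into n1 (left half) and n2 (right half).

    Ball-and-urn reading: each of the n balls goes left or right with probability
    p/2 each, and into the trash urn with probability 1 - p.
    """
    return (comb(n, n1) * comb(n - n1, n2)
            * (p / 2) ** n1 * (p / 2) ** n2 * (1 - p) ** (n - n1 - n2))

def rules_for(n, p):
    """All rules  n -> <n1 n2>  with their probabilities, for 0 <= n1 + n2 <= n."""
    return {(n1, n2): x(n, n1, n2, p)
            for n1 in range(n + 1)
            for n2 in range(n - n1 + 1)}

weights = rules_for(5, 0.8)
print(len(weights), sum(weights.values()))
```

For $n = 5$ there are 21 admissible pairs $(n_1, n_2)$, one per rule with left-hand side 5.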

Using our prototype, we computed $\mathbb{E}[W_n]$ and $\mathbb{E}[T_n]$ for $p = 0.8$ and $n =
0, 1, \ldots, 10$. Figure 3 shows that $\mathbb{E}[W_n]$ increases faster with $n$ than $\mathbb{E}[T_n]$; i.e., the
parallelism increases.




Fig. 3. Expectations of time and work.



4.2 Evaluation of Game Trees 

The evaluation of game trees is a central task of programs that are equipped with
"artificial intelligence" to play games such as chess. These game trees are min-max trees
(see Figure 4): each node corresponds to a position of the game, and each edge from a
parent to a child corresponds to a move that transforms the position represented by the
parent to a child position. Since the players have opposing objectives, the nodes
alternate between max-nodes and min-nodes (denoted ∨ and ∧, respectively). A leaf of a
game tree corresponds either to a final game position or to a position which is evaluated
heuristically by the game-playing program; in both cases, the leaf is assigned a number.
Given such a leaf labelling, a number can be assigned to each node in the tree in the
straightforward way; in particular, evaluating a tree means computing the root value.

Fig. 4. A game tree with value 3.

In the following, we assume for simplicity that each node is either a leaf or has
exactly three children. Figure 5 shows a straightforward recursive parallel procedure
for evaluating a max-node of a game tree. (Of course, there is a symmetrical procedure
for min-nodes.)



function parMax(node)
  if node.leaf() then return node.val()
  else parallel ⟨ val1 := parMin(node.c1), val2 := parMin(node.c2), val3 := parMin(node.c3) ⟩
       return max{val1, val2, val3}

Fig. 5. A simple parallel program for evaluating a game tree. 

Notice that in Figure 4 the value of the root is 3, independent of some missing leaf 
values. Game-playing programs aim at evaluating a tree as fast as possible, possibly 
by not evaluating nodes which are irrelevant for the root value. The classic technique 
is called alpha-beta pruning: it maintains an interval $[\alpha, \beta]$ in which the value of the
current node is to be determined exactly. If the value turns out to be below $\alpha$ or above $\beta$,
it is safe to return $\alpha$ or $\beta$, respectively. This may be the case even before all children have
been evaluated (a so-called cut-off). Figure 6 shows a sequential program for alpha-beta
pruning, initially to be called "seqMax(root, $-\infty$, $+\infty$)". Applying seqMax to the tree
from Figure 4 results in several cut-offs: all non-labelled leaves are pruned.



function seqMax(node, α, β)
  if node.leaf() then if node.val() ≤ α then return α
                      elsif node.val() ≥ β then return β
                      else return node.val()
  else val1 := seqMin(node.c1, α, β)
       if val1 = β then return β
       else val2 := seqMin(node.c2, val1, β)
            if val2 = β then return β
            else return seqMin(node.c3, val2, β)

Fig. 6. A sequential program for evaluating a game tree using alpha-beta pruning. 
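
The two procedures above can be tried out directly. The following Python sketch is only an illustration under stated assumptions: a tree is encoded as either an integer (a leaf value) or a tuple of three subtrees, and the min-node procedures `par_min` and `seq_min` are our own symmetric counterparts of Figures 5 and 6, which the text mentions but does not list. The parallel construct of Figure 5 is evaluated sequentially here, since only the returned value is of interest.

```python
import math

# Assumed encoding: a tree is an int (leaf value) or a tuple of three subtrees.

def is_leaf(node):
    return isinstance(node, int)

def par_max(node):
    """Value computed by parMax (Fig. 5): plain minimax at a max-node."""
    if is_leaf(node):
        return node
    return max(par_min(child) for child in node)

def par_min(node):
    """Symmetric counterpart of par_max for min-nodes (our assumption)."""
    if is_leaf(node):
        return node
    return min(par_max(child) for child in node)

def clip(value, alpha, beta):
    """Leaf case of Fig. 6: return alpha if value <= alpha, beta if value >= beta."""
    return min(max(value, alpha), beta)

def seq_max(node, alpha, beta):
    """Alpha-beta evaluation of a max-node, following Fig. 6."""
    if is_leaf(node):
        return clip(node, alpha, beta)
    val1 = seq_min(node[0], alpha, beta)
    if val1 == beta:
        return beta                      # cut-off after the first child
    val2 = seq_min(node[1], val1, beta)
    if val2 == beta:
        return beta                      # cut-off after the second child
    return seq_min(node[2], val2, beta)

def seq_min(node, alpha, beta):
    """Symmetric counterpart of seq_max for min-nodes (our assumption)."""
    if is_leaf(node):
        return clip(node, alpha, beta)
    val1 = seq_max(node[0], alpha, beta)
    if val1 == alpha:
        return alpha
    val2 = seq_max(node[1], alpha, val1)
    if val2 == alpha:
        return alpha
    return seq_max(node[2], alpha, val2)

tree = ((3, 4, 2), (1, 0, 4), (2, 2, 3))   # root is a max-node
root_value = seq_max(tree, -math.inf, math.inf)
```

On this example both `par_max(tree)` and `root_value` equal 2; with the full window $(-\infty, +\infty)$ the pruned evaluation returns the same root value as the exhaustive one.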

Although alpha-beta pruning may seem inherently sequential, parallel versions have 
been developed, often involving the Young Brothers Wait (YBW) strategy [17]. It relies 
on a good ordering heuristic, i.e., a method that sorts the children of a max-node (resp. 
min-node) in increasing (resp. decreasing) order, without actually evaluating the chil- 
dren. Such an ordering heuristic is often available, but usually not perfect. The tree in 
Figure 4 is ordered in this way. If alpha-beta pruning is performed on such an ordered 
tree, then either all children of a node are evaluated or only the first one. The YBW 
method first evaluates the first child only and hopes that this creates a cut-off or, at
least, decreases the interval $[\alpha, \beta]$. If the first child fails to cause a cut-off, YBW speculates
that both "younger brothers" need to be evaluated, which can be done in parallel
without wasting work. A wrong speculation may affect the performance, but not the 
correctness. Figure 7 shows a YBW-based program. Similar code is given in [8] using 
Cilk, a C-based parallel programming language. 



function YBWMax(node, α, β)
    if node.leaf() then if node.val() ≤ α then return α
                        elsif node.val() ≥ β then return β
                        else return node.val()
    else val1 := YBWMin(node.c1, α, β)
         if val1 = β then return β
         else parallel ( val2 := YBWMin(node.c2, val1, β), val3 := YBWMin(node.c3, val1, β) )
              return max{val2, val3}

Fig. 7. A parallel program based on YBW for evaluating a game tree. 
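
To make the difference between work and time concrete, here is a small Python sketch of the YBW procedure that also reports both quantities. Several things are assumptions of the sketch and not taken from the paper: the tree encoding (an integer leaf or a tuple of three subtrees), the min-node procedure `ybw_min` (our symmetric counterpart of Figure 7), and the cost model, in which visiting a node costs one unit, work adds up over all visits, and time takes the maximum over the two branches evaluated in parallel.

```python
import math

def leaf(node):
    return isinstance(node, int)

def clip(value, alpha, beta):
    return min(max(value, alpha), beta)

def ybw_max(node, alpha, beta):
    """Return (value, work, time) for a max-node, following Fig. 7."""
    if leaf(node):
        return clip(node, alpha, beta), 1, 1
    v1, w1, t1 = ybw_min(node[0], alpha, beta)
    if v1 == beta:                                   # cut-off by the eldest child
        return beta, 1 + w1, 1 + t1
    v2, w2, t2 = ybw_min(node[1], v1, beta)          # the two younger brothers
    v3, w3, t3 = ybw_min(node[2], v1, beta)          # are assumed to run in parallel
    return max(v2, v3), 1 + w1 + w2 + w3, 1 + t1 + max(t2, t3)

def ybw_min(node, alpha, beta):
    """Symmetric counterpart for min-nodes (our assumption)."""
    if leaf(node):
        return clip(node, alpha, beta), 1, 1
    v1, w1, t1 = ybw_max(node[0], alpha, beta)
    if v1 == alpha:
        return alpha, 1 + w1, 1 + t1
    v2, w2, t2 = ybw_max(node[1], alpha, v1)
    v3, w3, t3 = ybw_max(node[2], alpha, v1)
    return min(v2, v3), 1 + w1 + w2 + w3, 1 + t1 + max(t2, t3)

def minimax(node, maximise=True):
    """Exhaustive reference evaluation, used only to check the value."""
    if leaf(node):
        return node
    values = [minimax(child, not maximise) for child in node]
    return max(values) if maximise else min(values)

tree = ((3, 4, 2), (1, 0, 4), (2, 2, 3))   # root is a max-node
value, work, elapsed = ybw_max(tree, -math.inf, math.inf)
```

Under this cost model the example tree (13 nodes) yields value 2 with work 9 and time 6: four leaves are never visited, and the parallel evaluation of the younger brothers shortens the time below the work.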

We evaluate the performance of these three (deterministic) programs using probabilistic
assumptions about the game trees. More precisely, we assume the following:
Each node has exactly three children with probability $p$, and is a leaf with probability
$1-p$. A leaf (and hence any node) takes as value a number from $\mathbb{N}_4 := \{0,1,2,3,4\}$,
according to a distribution described below. In order to model an ordering heuristic
on the children, each node carries a parameter $e \in \mathbb{N}_4$ which intuitively corresponds
to its expected value. If a max-node with parameter $e$ has children, then they are
min-nodes with parameters $e$, $e \ominus 1$, $e \ominus 2$, respectively, where $a \ominus b := \max\{a-b, 0\}$;
similarly, the children of a min-node with parameter $e$ are max-nodes with parameters
$e$, $e \oplus 1$, $e \oplus 2$, where $a \oplus b := \min\{a+b, 4\}$. A leaf-node with parameter $e$ takes
value $k$ with probability $\binom{4}{k} \cdot (e/4)^k \cdot (1-e/4)^{4-k}$; i.e., a leaf value is binomially
distributed with expectation $e$. One could think of a game tree as the terminal tree of
a branching process with $\Gamma = \{Max(e), Min(e) \mid e \in \{0, \ldots, 4\}\}$ and $Q = \mathbb{N}_4$
and the rules $Max(e) \stackrel{p}{\hookrightarrow} \langle Min(e)\ Min(e \ominus 1)\ Min(e \ominus 2)\rangle$ and $Max(e) \stackrel{x(k)}{\hookrightarrow} k$, with
$x(k) := (1-p) \cdot \binom{4}{k} \cdot (e/4)^k \cdot (1-e/4)^{4-k}$ for all $e, k \in \mathbb{N}_4$, and similar rules for $Min(e)$.
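
As a sanity check of these assumptions, the following Python sketch computes the leaf distribution exactly and the parameters passed to the children. The function names are ours and purely illustrative; the formulas are the ones stated above.

```python
from fractions import Fraction
from math import comb

def leaf_distribution(e):
    """P(leaf value = k) for k = 0..4: binomial with n = 4 and success probability e/4."""
    q = Fraction(e, 4)
    return [comb(4, k) * q**k * (1 - q)**(4 - k) for k in range(5)]

def children_parameters(e, is_max):
    """Parameters of the three children of a node with parameter e."""
    if is_max:
        return [e, max(e - 1, 0), max(e - 2, 0)]   # e, e (-) 1, e (-) 2
    return [e, min(e + 1, 4), min(e + 2, 4)]       # e, e (+) 1, e (+) 2

def rule_probabilities(e, p):
    """x(k) = (1 - p) * P(leaf value = k): probability that Max(e) is a leaf with value k."""
    return [(1 - Fraction(p)) * pk for pk in leaf_distribution(e)]

dist = leaf_distribution(3)
mean = sum(k * pk for k, pk in enumerate(dist))    # expectation of the leaf value
```

For every parameter $e$ the distribution sums to 1 and has expectation $e$, and the values $x(0), \ldots, x(4)$ together with the branching probability $p$ sum to 1, as required for the rules of $Max(e)$.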

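We model the YBW-program from Figure 7 running on such random game trees by
the pSJS with $Q = \{0, 1, 2, 3, 4, q(\vee), q(\wedge)\} \cup \{q(\alpha, \beta, \vee, e), q(\alpha, \beta, \wedge, e) \mid 0 \le \alpha <
\beta \le 4,\ 0 \le e \le 4\} \cup \{q(a, b) \mid 0 \le a, b \le 4\}$ and the following rules:

$$
\begin{array}{l}
Max(\alpha,\beta,e) \stackrel{x(0)+\cdots+x(\alpha)}{\hookrightarrow} \alpha, \quad Max(\alpha,\beta,e) \stackrel{x(\beta)+\cdots+x(4)}{\hookrightarrow} \beta, \quad Max(\alpha,\beta,e) \stackrel{x(k)}{\hookrightarrow} k \\[2pt]
Max(\alpha,\beta,e) \stackrel{p}{\hookrightarrow} \langle Min(\alpha,\beta,e)\ \ q(\alpha,\beta,\vee,e \ominus 1)\rangle \\[2pt]
\langle \beta\ \ q(\alpha,\beta,\vee,e)\rangle \stackrel{1}{\hookrightarrow} \beta, \quad \langle \gamma\ \ q(\alpha,\beta,\vee,e)\rangle \stackrel{1}{\hookrightarrow} \langle Max2(\gamma,\beta,e)\ \ q(\vee)\rangle \\[2pt]
Max2(\alpha,\beta,e) \stackrel{1}{\hookrightarrow} \langle Min(\alpha,\beta,e)\ \ Min(\alpha,\beta,e \ominus 1)\rangle \\[2pt]
\langle a\ b\rangle \stackrel{1}{\hookrightarrow} q(a,b), \quad \langle q(a,b)\ \ q(\vee)\rangle \stackrel{1}{\hookrightarrow} \max\{a,b\},
\end{array}
$$

where $0 \le \alpha \le \gamma < \beta \le 4$ and $\alpha < k < \beta$ and $0 \le e \le 4$ and $0 \le a, b \le 4$. There are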
analogous rules with Min and Max exchanged. Notice that the rules closely follow the 
program from Figure 7. The programs parMax and seqMax from Figures 5 and 6 can 
be modelled similarly. 

Let $T(\mathrm{YBW}, p) := \mathbb{E}\,[T_{Max(0,4,2)} \mid Run(Max(0,4,2){\downarrow}2)]$; i.e., $T(\mathrm{YBW}, p)$ is the
expected time of the YBW-program called with a tree with value 2 and whose root is a
max-node with parameter 2. (Recall that $p$ is the probability that a node has children.)
Let $W(\mathrm{YBW}, p)$ be defined similarly for the expected work, and define these numbers also
for par and seq instead of YBW, i.e., for the programs from Figures 5 and 6. Using our
prototype we computed $W(\mathrm{seq}, p)$ = 1.00, 1.43, 1.96, 2.63, 3.50, 4.68, 6.33 for $p$ =
0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30. Since the program seq is sequential, we have
the same sequence for $T(\mathrm{seq}, p)$. To assess the speed of the parallel programs par and
YBW, we also computed the percentaged increase of their runtime relative to seq, i.e.,
$100 \cdot (T(\mathrm{par}, p)/T(\mathrm{seq}, p) - 1)$, and similarly for YBW. Figure 8 shows the results. One
can observe that for small values of $p$ (i.e., small trees), the program par is slightly faster
than seq because of its parallelism. For larger values of $p$, par still evaluates all nodes in
the tree, whereas seq increasingly benefits from cut-offs of potentially deep branches.
Using Proposition 11, one can prove $W(\mathrm{par}, \tfrac{1}{3}) = T(\mathrm{par}, \tfrac{1}{3}) = \infty > W(\mathrm{seq}, \tfrac{1}{3})$.¹
The figure also shows that the YBW-program is faster than seq: the advantage of YBW
increases with $p$ up to about 10%.

Fig. 8. Percentaged runtime increase of par and YBW relative to seq.

We also compared the work of YBW with seq, and found that the percentaged increase
ranges from 0 to about +0.4% for $p$ between 0 and 0.3. This means that YBW
wastes almost no work; in other words, a sequential version of YBW would be almost
as fast as seq. An interpretation is that the second child rarely causes large cut-offs. Of
course, all of these findings could depend on the exact probabilistic assumptions on the
game trees.

¹ In fact, $W(\mathrm{seq}, p)$ is finite even for values of $p$ which are slightly larger than $\tfrac{1}{3}$; in other words,
seq cuts off infinite branches.

5 Conclusions and Future Work 

We have introduced pSJSs, a model for probabilistic parallel programs with process
spawning and synchronisation. We have studied the basic performance measures of ter- 
mination probability, space, work, and time. In our results the upper complexity bounds 
coincide with the best ones known for pPDSs, and the lower bounds also hold for 
pPDSs. This suggests that analysing pSJSs is no more expensive than analysing pPDSs. 
The pSJS model is amenable to a practical performance analysis. Our two case studies 
have demonstrated the modelling power of pSJSs: one can use pSJSs to model, analyse, 
and compare the performance of parallel programs under probabilistic assumptions. 

We intend to develop model-checking algorithms for pSJSs. It seems to us that a 
meaningful functional analysis should not only model-check the Markov chain induced 
by the pSJS, but rather take the individual process "histories" into account. 

Acknowledgements. We thank Javier Esparza, Alastair Donaldson, Markus Müller-Olm,
Luke Ong and Thomas Wahl for helpful discussions on the non-probabilistic version of 

A Proofs of Section 3.1



Here is a restatement of Proposition 4. 

Proposition 4. There is a probability-preserving bijection between the runs $Run(\sigma{\downarrow}q)$
in $M_S$ and the runs $Run(\Box\sigma{\downarrow}q)$ in $M_{S_P}$. In particular, we have $[\sigma{\downarrow}q] = [\Box\sigma{\downarrow}q]$.

Proof. We define a bijection $b : Run[M_S](\sigma{\downarrow}q) \to Run[M_{S_P}](\Box\sigma{\downarrow}q)$. Since these
sets only contain runs that reach a terminal state (namely, q and q, respectively), we
identify in this proof a run with the (finite) path that leads to the terminal state. For a
path $w$ in $M_{S_P}$ with length $n$ we write $w|\sigma$ for the path $z$ of length $n$ with $z(i) = w(i)\sigma$
for all $0 \le i \le n-1$.

Let $w \in Run[M_S](\sigma{\downarrow}q)$. We define $b(w)$ inductively by order of the length $n$ of $w$.
The run $w$ has one of the following forms:

- Let $w = w(0)$, where $w(0) = \sigma = q$. Then we set $b(w) := \Box q, q$.

- Let $w = a, \langle\sigma_1\sigma_2\rangle, \ldots, w(k), \ldots, q$, where $a \in \Gamma$ and $w(k) = \langle q_1 q_2\rangle$ for
some $k$ with $1 \le k \le n-2$, such that $w(i) \notin \Gamma$ for $i \in \{1, \ldots, k-1\}$. Consider
the left and the right subtree of the root nodes in the run between $w(1) = \langle\sigma_1\sigma_2\rangle$
and $w(k) = \langle q_1 q_2\rangle$. By the semantics of pSJSs, there are corresponding runs $u_1 =
\sigma_1, \ldots, q_1$ and $u_2 = \sigma_2, \ldots, q_2$ such that $u_1 \in Run[M_S](\sigma_1{\downarrow}q_1)$ and $u_2 \in
Run[M_S](\sigma_2{\downarrow}q_2)$ and $\max\{|u_1|, |u_2|\} = k$. Let $z_1 := b(u_1)$ and $z_2 := b(u_2)$ and
$z_3 := b(w(k), \ldots, w(n-1))$. Then we set $b(w) := \Box a,\ z_1|\sigma_2,\ z_2|q_1,\ z_3$.

- Let $w = a, \sigma_1, \ldots, q$, where $a \in \Gamma$ and $\sigma_1 \in \Sigma \setminus \langle QQ\rangle$. Then we set $b(w) :=
\Box a,\ b(\sigma_1, \ldots, q)$.

Notice that in all cases $b(w) \in Run[M_{S_P}](\Box\sigma{\downarrow}q)$. It is easy to check that $b$ is a bijection.
It is also easy to see that $\mathcal{P}(\{w\}) = \mathcal{P}(\{b(w)\})$, because in both cases the
probability is the product of the probabilities of the applied transition rules (of course,
taking multiplicities into account). □



B Proofs of Section 3.2 

B.1 Proof of Proposition 5

Here is a restatement of Proposition 5. 

Proposition 5. Let $\sigma \in \Sigma$ and $q \in Q$. Then $[\sigma{\downarrow}q]$ is the value of the variable for $\sigma$ and $q$ in the least
(w.r.t. componentwise ordering) nonnegative solution of the above equation system.

Proof. The proposition can be proved by adapting the corresponding proofs for pPDSs
from [12] or [16]. Alternatively, we can use the "serialisation" procedure from Section 3.1
and the equality $[\sigma{\downarrow}q] = [\Box\sigma{\downarrow}q]$, reducing the problem from pSJSs to pPDSs.
Then we can take the equation system from [12, 16] for $[\Box\sigma{\downarrow}q]$ and compress it by
substituting variables with the right-hand side of their equation. This gives the same
equation system as the one above. □

B.2 Proof of Theorem 6



Here is a restatement of Theorem 6. 

Theorem 6. Consider a pSJS with alphabet $\Sigma = \Gamma \cup Q$. Let $\sigma \in \Sigma$ and $q \in Q$. Then
(1) one can efficiently express (in the sense defined in Section 2) the value of $[\sigma{\downarrow}q]$, (2)
deciding whether $[\sigma{\downarrow}q] = 0$ is in P, and (3) deciding whether $[\sigma{\downarrow}q] < 1$ is PosSLP-hard
even for pPDSs.

Proof. The respective claims are shown for pPDSs in [14, 16]. For statements (1)
and (2), we use Proposition 4 to reduce the problem to pPDSs. For statement (3), recall
from Section 3.1 that pPDSs can be encoded as pSJSs. □

C Proofs of Section 3.3 

C.1 Proof of Lemma 7

Here is a restatement of Lemma 7. 

Lemma 7. The set U can be computed in polynomial time. 

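Proof. We write $t_1 \to^* t_2$ if $t_2$ can be reached from $t_1$ in the Markov chain $M_S$ induced
by the pSJS $S$; i.e., $\to^*$ is the reflexive and transitive closure of $\to$. Define $\Rightarrow\ :=\
\to^* \cap\ (\Sigma \times \Sigma)$; i.e., we have $\sigma_1 \Rightarrow \sigma_2$ if and only if $\sigma_2 \in \Sigma$ can be reached from
$\sigma_1 \in \Sigma$. The relation $\Rightarrow$ can be computed in polynomial time using the fact that it is
the smallest subset of $\Sigma \times \Sigma$ that satisfies:

- $\sigma \Rightarrow \sigma$ for all $\sigma \in \Sigma$;

- $\sigma_1 \hookrightarrow \sigma_2 \Rightarrow \sigma_3$ implies $\sigma_1 \Rightarrow \sigma_3$;

- $\sigma_1 \hookrightarrow \langle\sigma_2\sigma_3\rangle$ and $[\sigma_2{\downarrow}q_2] > 0$ and $[\sigma_3{\downarrow}q_3] > 0$ and $\langle q_2 q_3\rangle \Rightarrow \sigma_4$ imply $\sigma_1 \Rightarrow \sigma_4$.

For a tree $t \in T(\Sigma)$, let $h(t)$ denote its height, i.e., the maximal distance of a leaf to the
root. Moreover, we define for each $k \in \mathbb{N}$:

$$U_k := \{\sigma \in \Sigma \mid \text{there is a tree } t \in T(\Sigma) \text{ with } \sigma \to^* t \text{ and } h(t) \ge k\}.$$

Notice that $\Sigma = U_0 \supseteq U_1 \supseteq U_2 \supseteq \cdots$. It is easy to see that we have for all $k \in \mathbb{N}$:

$$U_{k+1} = \{a \in \Gamma \mid \exists \text{ transition } b \hookrightarrow \langle\sigma_1\sigma_2\rangle : a \Rightarrow b \text{ and } \{\sigma_1, \sigma_2\} \cap U_k \ne \emptyset\}.$$

It follows from this characterisation that the sequence $U_0 \supseteq U_1 \supseteq \cdots$ stabilises after
at most $|\Sigma|$ steps; i.e., we have $U_{|\Sigma|} = U_{|\Sigma|+1} = \cdots$. In other words, we have:

$$U_{|\Sigma|} = \{a \in \Gamma \mid \forall n \in \mathbb{N} : \exists t \in T(\Sigma) : a \to^* t \text{ and } h(t) \ge n\}.$$

For each tree $t \in T(\Sigma)$ we have $h(t) < |t| \le 2^{h(t)}$ (recall that $|t|$ is the length of $t$
not counting the symbols '$\langle$' and '$\rangle$'). Therefore, we have $U = U_{|\Sigma|}$, so it suffices to
compute $U_{|\Sigma|}$, which can be done in polynomial time with the above characterisation.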

□ 






C.2 Proof of Proposition 8 



Here is a restatement of Proposition 8. 

Proposition 8. (1) The pSJS $\bar{S}$ is normalised; (2) the value $[a{\downarrow}q]$ for $a \in \Gamma$ and
$q \in Q$ is the same in $S$ and $\bar{S}$; (3) we have $\mathcal{P}(S_a < \infty = T_a \mid Run(a)) = [a{\downarrow}\bar{q}]$ for
all $a \in \Gamma$.

Proof. Statement (1) follows from the fact that $\bar{S}$ has, by construction, for all $q_1, q_2 \in Q$
a transition with $\langle q_1 q_2\rangle$ on the left hand side. (In other words, $\Gamma \supseteq Q \times Q$.)

For statement (2) observe that $\bar{S}$ is obtained from $S$ via two steps: a normalisation
step, and a step where transitions with process symbols $b \in B$ on the left hand side are
replaced. We argue that neither of those steps modifies the value $[a{\downarrow}q]$ for $a \in \Gamma$ and
$q \in Q$. The first step does not modify $[a{\downarrow}q]$, because the normalisation affects only those
runs that would not terminate in a synchronisation state without normalisation. With
normalisation, those runs may terminate in the new synchronisation state introduced
by the normalisation, but not in $q$. The second step does not modify $[a{\downarrow}q]$, because it
affects only those runs that would otherwise not terminate. With the modification in the
second step, those runs may terminate in $\bar{q}$, but not in $q$.

For statement (3), it is convenient to consider a version of $S$ "in between" the first
and the second modification step. More precisely, let $S^{\mathrm{norm}}$ denote the pSJS after the
normalisation step, and let $S^{\mathrm{stall}}$ denote the pSJS obtained from $S^{\mathrm{norm}}$ by removing all
transitions with symbols $b \in B$ on the left hand side and replacing them with a transition
$b \stackrel{1}{\hookrightarrow} b$. Notice that $\mathcal{P}(S_a < \infty = T_a)$ is the same in $S$ and $S^{\mathrm{norm}}$ and $S^{\mathrm{stall}}$ for all $a \in \Gamma$.
Denote by $Good(a)$ the set of those runs $w \in Run[M_{S^{\mathrm{stall}}}](a)$ that reach a bottom
strongly connected component (BSCC) of $M_{S^{\mathrm{stall}}}$. For any $n \in \mathbb{N}$ there are only finitely
many trees $t$ of $M_{S^{\mathrm{stall}}}$ such that $|t| \le n$. Hence it follows using standard arguments on
finite Markov chains that in $M_{S^{\mathrm{stall}}}$ we have for all $n \in \mathbb{N}$

$$\mathcal{P}(S_a \le n) = \mathcal{P}(S_a \le n \text{ and } Good(a)), \text{ and so}$$
$$\mathcal{P}(S_a \le n \text{ and } T_a = \infty) = \mathcal{P}(S_a \le n \text{ and } T_a = \infty \text{ and } Good(a)), \text{ and so}$$
$$\mathcal{P}(S_a < \infty = T_a) = \mathcal{P}(S_a < \infty = T_a \text{ and } Good(a)). \qquad (1)$$

Observe that each BSCC of $M_{S^{\mathrm{stall}}}$ consists of exactly one tree $t$ such that $Front(t)$ is
either empty or consists only of elements of $B$. So there is a natural 1-to-1 correspondence
between the runs of $S^{\mathrm{stall}}$ that satisfy "$S_a < \infty = T_a$ and $Good(a)$" and the runs
of $\bar{S}$ that are in $Run(a{\downarrow}\bar{q})$. With (1) we get $\mathcal{P}(S_a < \infty = T_a) = [a{\downarrow}\bar{q}]$. □

C.3 Proof of Theorem 9

Here is a restatement of Theorem 9. 

Theorem 9. Consider a pSJS with alphabet $\Sigma = \Gamma \cup Q$ and $a \in \Gamma$. Let $s :=
\mathcal{P}(S_a < \infty)$. Then (1) one can efficiently express $s$, (2) deciding whether $s = 0$ is in P,
and (3) deciding whether $s < 1$ is PosSLP-hard even for pPDSs.

Proof. It follows from Proposition 8 that $s = \sum_{q \in Q} [a{\downarrow}q]$; hence $s$ is expressible by
Theorem 6 (1). Moreover, it follows using Theorem 6 (2) that deciding whether $s = 0$
is in P. It remains to show statement (3).

We draw from a reduction in [16], where it is shown that deciding if a pPDS terminates
with probability 1 is PosSLP-hard. As a gadget for that, Etessami and Yannakakis [16]
compute, given a PosSLP instance, a pPDS $S$ with the following properties.
(More precisely, they construct an equivalent recursive Markov chain.) The starting
configuration is $qa$, and after having left the initial configuration, $S$ reaches a configuration
of the form $q\alpha$ with $\alpha \in \Gamma^*$ again with probability 1. At that time, the configuration
is $qaa$ with some probability $p$, and $q$ with probability $1-p$. Moreover, the time (and
hence space) needed to reach either of those configurations is essentially bounded by
the size of the given PosSLP instance, so it is finite. Furthermore, the given PosSLP
instance is a "yes instance" if and only if $p > \tfrac{1}{2}$. It is easy to see that $\mathcal{P}(S_{qa} = \infty) > 0$
in $S$ if and only if $\mathcal{P}(S_{qa} = \infty) > 0$ in the pPDS $\bar{S}$ that consists only of the transitions
$qa \stackrel{p}{\hookrightarrow} qaa$ and $qa \stackrel{1-p}{\hookrightarrow} q$. The Markov chain induced by $\bar{S}$, in turn, is isomorphic to the
simple random walk $X_0, X_1, \ldots$ on $\mathbb{N}$ with $X_0 = 1$ and $\mathcal{P}(X_{i+1} = 0 \mid X_i = 0) = 1$
and

$$\mathcal{P}(X_{i+1} = n+1 \mid X_i = n > 0) = p \quad\text{and}\quad \mathcal{P}(X_{i+1} = n-1 \mid X_i = n > 0) = 1-p.$$

It is well-known (see e.g. [25]) that for this random walk we have
$\mathcal{P}(\sup_{i \in \mathbb{N}} X_i = \infty) > 0$ if and only if $p > \tfrac{1}{2}$. It follows that we have
$\mathcal{P}(S_{qa} < \infty) < 1$ in $S$ if and only if $p > \tfrac{1}{2}$. □
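
The well-known fact about the random walk used at the end of this proof can be checked with the classical gambler's-ruin formula. The Python sketch below is an illustration only (the function names are ours): it computes the probability that the walk started at 1 reaches a given level before being absorbed at 0, and the limit of that probability as the level grows, which equals $\mathcal{P}(\sup_i X_i = \infty)$.

```python
from fractions import Fraction

def reach_before_zero(p, target):
    """Probability that the walk started at 1 hits `target` before hitting 0.

    Standard gambler's-ruin formula; p is the probability of a step upwards
    and is assumed to satisfy 0 < p < 1."""
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(1, target)
    r = (1 - p) / p
    return (1 - r) / (1 - r**target)

def escape_probability(p):
    """Limit of reach_before_zero(p, N) for N -> infinity."""
    p = Fraction(p)
    return max(Fraction(0), (2 * p - 1) / p)
```

For instance, `escape_probability(Fraction(2, 3))` is 1/2, whereas the result is 0 for every $p \le \tfrac{1}{2}$, in line with the threshold $p > \tfrac{1}{2}$ used in the reduction.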

D Proofs of Section 3.4 

D.1 Proof of Proposition 10

Here is a restatement of Proposition 10. 

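Proposition 10. Let $\sigma \in \Sigma$ and $q \in Q$ with $[\sigma{\downarrow}q] > 0$. Then

$$\mathcal{P}(W_\sigma = n \mid Run(\sigma{\downarrow}q)) = \mathcal{P}(W_{(\!|\sigma q|\!)} = n \mid Run((\!|\sigma q|\!))) \quad\text{for all } n \ge 0 \text{ and}$$
$$\mathcal{P}(T_\sigma \le n \mid Run(\sigma{\downarrow}q)) \le \mathcal{P}(T_{(\!|\sigma q|\!)} \le n \mid Run((\!|\sigma q|\!))) \quad\text{for all } n \ge 0.$$

In particular, we have $[(\!|\sigma q|\!){\downarrow}] = 1$.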

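Proof. Define, for $q \in Q$ and $\sigma \in \Sigma$,

$$D_{\sigma q}(n) := \mathcal{P}(Run(\sigma{\downarrow}q),\ W_\sigma = n \mid Run(\sigma)) \quad\text{and}\quad D_{(\!|\sigma q|\!)}(n) := \mathcal{P}(W_{(\!|\sigma q|\!)} = n \mid Run((\!|\sigma q|\!))).$$

Notice that $(\!|\sigma q|\!)$ and thus $D_{(\!|\sigma q|\!)}(n)$ are undefined, if and only if $[\sigma{\downarrow}q] = 0$. For the
arithmetical expressions in the rest of this proof, we assume $0 \cdot \text{undefined} = 0$. For the
statement on $W$ in the proposition, it suffices to show $D_{\sigma q}(n) = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(n)$ for
$n \ge 0$. We proceed by induction on $n$.

Let $n = 0$. If $[\sigma{\downarrow}q] = 0$, then $D_{\sigma q}(0) = 0 = 0 \cdot \text{undefined} = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(0)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma = q$, then $D_{\sigma q}(0) = 1 = 1 \cdot 1 = [\sigma{\downarrow}q] \cdot D_{\bot}(0) = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(0)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma \in \Gamma$, then $D_{\sigma q}(0) = 0 = [\sigma{\downarrow}q] \cdot 0 = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(0)$.

Let $n > 0$. If $[\sigma{\downarrow}q] = 0$, then $D_{\sigma q}(n) = 0 = 0 \cdot \text{undefined} = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(n)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma = q$, then $D_{\sigma q}(n) = 0 = 1 \cdot 0 = [\sigma{\downarrow}q] \cdot D_{(\!|\sigma q|\!)}(n)$. If $[\sigma{\downarrow}q] > 0$ and
$\sigma = a \in \Gamma$, then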



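$$
\begin{aligned}
D_{aq}(n) &= \sum_{\substack{a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle \\ \langle q_1 q_2\rangle \in \Gamma \cap \langle QQ\rangle}} \ \sum_{1+i+j+k=n} p \cdot D_{\sigma_1 q_1}(i) \cdot D_{\sigma_2 q_2}(j) \cdot D_{\langle q_1 q_2\rangle q}(k)
 \;+ \sum_{\substack{a \stackrel{p}{\hookrightarrow} \sigma' \\ \sigma' \in \Sigma \setminus \langle QQ\rangle}} p \cdot D_{\sigma' q}(n-1) \\
&= \sum_{\substack{a \stackrel{p}{\hookrightarrow} \langle\sigma_1\sigma_2\rangle \\ \langle q_1 q_2\rangle \in \Gamma \cap \langle QQ\rangle}} \ \sum_{1+i+j+k=n} p \cdot [\sigma_1{\downarrow}q_1] \cdot [\sigma_2{\downarrow}q_2] \cdot [\langle q_1 q_2\rangle{\downarrow}q]
 \cdot D_{(\!|\sigma_1 q_1|\!)}(i) \cdot D_{(\!|\sigma_2 q_2|\!)}(j) \cdot D_{(\!|\langle q_1 q_2\rangle q|\!)}(k) \\
&\qquad + \sum_{\substack{a \stackrel{p}{\hookrightarrow} \sigma' \\ \sigma' \in \Sigma \setminus \langle QQ\rangle}} p \cdot [\sigma'{\downarrow}q] \cdot D_{(\!|\sigma' q|\!)}(n-1) \\
&= [a{\downarrow}q] \cdot \Bigg( \sum_{(\!|aq|\!) \stackrel{p'}{\hookrightarrow} \langle (\!|\sigma_1 q_1|\!) (\!|\sigma_2 q_2|\!) (\!|\langle q_1 q_2\rangle q|\!) \rangle} \ \sum_{1+i+j+k=n}
 p' \cdot D_{(\!|\sigma_1 q_1|\!)}(i) \cdot D_{(\!|\sigma_2 q_2|\!)}(j) \cdot D_{(\!|\langle q_1 q_2\rangle q|\!)}(k) \\
&\qquad\qquad + \sum_{(\!|aq|\!) \stackrel{p'}{\hookrightarrow} (\!|\sigma' q|\!)} p' \cdot D_{(\!|\sigma' q|\!)}(n-1) \Bigg) \\
&= [a{\downarrow}q] \cdot D_{(\!|aq|\!)}(n),
\end{aligned}
$$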

where the first and the last equality are by the definition of $W$ and the semantics of
pSJSs, the second equality is by the induction hypothesis, and the third equality is by
the definition of F. This proves the statement on $W$ in the proposition.

The proof of the statement on $T$ is similar. Define, for $q \in Q$ and $\sigma \in \Sigma$,



Eaq{n) 



= V (Run{aiq), < n [ Run{a)) and 
= V {Run{aiq), = n j Run{a)) and 
- V (Tj,,j < n I Runi^aq))) . 



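Notice that $(\!|\sigma q|\!)$ and thus $E_{(\!|\sigma q|\!)}(n)$ are undefined, if and only if $[\sigma{\downarrow}q] = 0$. It suffices
to show $E_{\sigma q}(n) \le [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(n)$ for $n \ge 0$. We proceed by induction on $n$.

Let $n = 0$. If $[\sigma{\downarrow}q] = 0$, then $E_{\sigma q}(0) = 0 = 0 \cdot \text{undefined} = [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(0)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma = q$, then $E_{\sigma q}(0) = 1 = 1 \cdot 1 = [\sigma{\downarrow}q] \cdot E_{\bot}(0) = [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(0)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma \in \Gamma$, then $E_{\sigma q}(0) = 0 = [\sigma{\downarrow}q] \cdot 0 = [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(0)$.

Let $n > 0$. If $[\sigma{\downarrow}q] = 0$, then $E_{\sigma q}(n) = 0 = 0 \cdot \text{undefined} = [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(n)$. If
$[\sigma{\downarrow}q] > 0$ and $\sigma = q$, then $E_{\sigma q}(n) = 1 = 1 \cdot 1 = [\sigma{\downarrow}q] \cdot E_{(\!|\sigma q|\!)}(n)$. If $[\sigma{\downarrow}q] > 0$ and
$\sigma = a \in \Gamma$, then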

+ ^ p- E„,q{n - 1) 

p 

a'eS\{QQ) 

+ ^ p • £<,/g(n - 1) 
<T'ei:\<QQ) 

E --E^t^igi ("-!)■ -^0-292 (»^-l)--£^(«ig2><z("-l) 

p 

a'^(o-i<72> 

(9ig2>ern(QQ> 



< 



E P ■ ['^i^Q'i] ■ [o-2-i92] • [{qiq2)iq\ 



a"-^ {010-2) 

{gig2)ern{QQ) 



■ E(iaiqi\,{n - 1) • E^^^q^^in - 1) • £^^(qi52>«j(" 

+ E P-Wiq]- E(^„,q^{n-1) 



a'6i:\(QQ> 



a. 



49] 



Vda-Jl) ( (1<T1 91 N<^2 92 1) (1 (91 92 > 9D> 

(la9|)^(l<T'9|) / 
where the first and the last equality are by the definition of $T$ and the semantics of
pSJSs, the first inequality and the second equality are trivial, the second inequality is by
the induction hypothesis, and the third equality is by the definition of F.
For the final statement, observe that

$$[(\!|\sigma q|\!){\downarrow}] = \lim_{n \to \infty} \mathcal{P}(T_{(\!|\sigma q|\!)} \le n \mid Run((\!|\sigma q|\!))) \ge \lim_{n \to \infty} \mathcal{P}(T_\sigma \le n \mid Run(\sigma{\downarrow}q)) = 1.$$

□



D.2 Proof of Proposition 11 

Here is a restatement of Proposition 11.

Proposition 11. Let $(S, X_0)$ be a reduced branching process. Let $A$ be the associated
characteristic matrix. Then the following statements are equivalent:

(1) $\mathbb{E}W_{X_0}$ is finite; (2) $\mathbb{E}T_{X_0}$ is finite; (3) $\rho(A) < 1$.

Further, if $\mathbb{E}W_{X_0}$ is finite, then it equals the $X_0$-component of $(I - A)^{-1} \cdot \mathbf{1}$, where $I$
is the identity matrix, and $\mathbf{1}$ is the column vector with all ones.

Proof. Clearly, statement (1) implies statement (2), because, by definition,
$W_{X_0} \ge T_{X_0}$. Next, we prove that statement (3) implies statement (1). For each $X \in \Gamma$
and $i \in \mathbb{N}$, we define a random variable $z^{(i)}_X$ over $Run(X_0)$ by setting $z^{(i)}_X(w) :=
|Front(w(i))|_X$; i.e., $z^{(i)}_X$ is the number of active $X$-processes at time $i$. We assemble
the $z^{(i)}_X$ in a row vector $z^{(i)}$. Note that $|Front(w(i))| = \|z^{(i)}(w)\|_1$. It is easy to see
(by induction, see also [2, p. 184]) that $\mathbb{E}[z^{(i)}] = e^{(X_0)} \cdot A^i$, where by $e^{(X_0)} \in \mathbb{R}^\Gamma$ we
mean the row vector whose only nonzero component is the $X_0$-component, which is 1.
Consequently, we have

$$\mathbb{E}W_{X_0} = \sum_{i=0}^{\infty} \mathbb{E}\,[\,|Front(w(i))| \mid w \in Run(X_0)\,] = \sum_{i=0}^{\infty} \|e^{(X_0)} A^i\|_1 = \|e^{(X_0)} A^*\|_1,$$

where $A^* := \sum_{i=0}^{\infty} A^i$. It is known [19] that the matrix series $A^*$ converges if and only
if $\rho(A) < 1$. It follows that statement (3) implies statement (1). Further, it is known [19]
that if $\rho(A) < 1$, then $A^* = (I - A)^{-1}$, so then $\mathbb{E}W_{X_0} = \|e^{(X_0)} \cdot (I - A)^{-1}\|_1$, which
equals the $X_0$-component of $(I - A)^{-1} \cdot \mathbf{1}$.
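
The identity $A^* \cdot \mathbf{1} = (I - A)^{-1} \cdot \mathbf{1}$ can be illustrated numerically. The Python sketch below (our own illustration, using floating-point arithmetic and a hypothetical stopping tolerance) iterates $x \leftarrow \mathbf{1} + A x$ starting from $x = \mathbf{1}$, so that after $k$ rounds $x = \sum_{i=0}^{k} A^i \mathbf{1}$ is a partial sum of the series; the matrices in the examples are made up for the purpose of the illustration.

```python
def expected_work(A, tol=1e-12, max_iter=100_000):
    """Approximate (I - A)^{-1} * 1 by the partial sums of the Neumann series.

    A is a square nonnegative matrix given as a list of rows. Returns the
    vector of approximate expected works, or None if the partial sums have
    not settled within max_iter rounds (as happens when rho(A) >= 1)."""
    n = len(A)
    x = [1.0] * n                                    # partial sum for k = 0
    for _ in range(max_iter):
        new = [1.0 + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return None

single = expected_work([[0.5]])                      # one symbol with A = (2p), p = 0.25
two_types = expected_work([[0.2, 0.5], [0.4, 0.0]])  # a made-up two-symbol example
```

In the one-symbol example the result approaches $1/(1 - 0.5) = 2$, and in the two-symbol example it approaches $(2.5, 2)$, which is the solution of $(I - A)x = \mathbf{1}$. For $A = (1)$ the partial sums grow without bound and the function returns `None`.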

It remains to show that statement (2) implies statement (3). For this part, we rely on
Perron-Frobenius theory. Let $\rho(A) \ge 1$. Call a matrix $B \in \mathbb{R}^{n \times n}$ strongly connected,
if for all $1 \le i, j \le n$ there is $k \ge 0$ such that $(B^k)_{ij} \ne 0$. Since $A$ is nonnegative,
Corollary 2.1.6 of [3] asserts that there exists a strongly connected principal submatrix
$A'$ of $A$ such that $\rho(A') \ge 1$; i.e., there is $\Gamma' \subseteq \Gamma$ such that the matrix $A' \in \mathbb{R}^{\Gamma' \times \Gamma'}$
obtained from $A$ by deleting all rows and columns not indexed with elements of $\Gamma'$
is strongly connected. We will show that $\mathbb{E}T_X = \infty$ for all $X \in \Gamma'$. Since $(S, X_0)$
is reduced, this implies that $\mathbb{E}T_{X_0}$ is infinite. Therefore, to simplify the notation, we
assume in the following w.l.o.g. that $A = A'$, i.e., $A$ is strongly connected. We will
show $\mathbb{E}T_X = \infty$ for all $X \in \Gamma$.

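Define, for each $i \in \{0, 1, 2, 3\}$, a function $L^{(i)} : (\mathbb{R}^\Gamma)^i \to \mathbb{R}^\Gamma$ as follows, where
$y, z, w \in \mathbb{R}^\Gamma$ are column vectors (we use subscripts to refer to indices):

$$L^{(0)}_X := \sum_{X \stackrel{p}{\hookrightarrow} \bot} p, \qquad L^{(1)}(y)_X := \sum_{X \stackrel{p}{\hookrightarrow} Y} p \cdot y_Y,$$
$$L^{(2)}(y, z)_X := \sum_{X \stackrel{p}{\hookrightarrow} \langle YZ\rangle} p \cdot y_Y \cdot z_Z, \qquad L^{(3)}(y, z, w)_X := \sum_{X \stackrel{p}{\hookrightarrow} \langle YZW\rangle} p \cdot y_Y \cdot z_Z \cdot w_W.$$

Notice that $L^{(0)}$ is a vector of constants, and $L^{(1)}, L^{(2)}, L^{(3)}$ are linear, bilinear, trilinear
vector functions, respectively. We write $L^{(2)}(y, \cdot)$ and $L^{(3)}(\cdot, z, w)$ etc. to mean the
matrices $U, V \in \mathbb{R}^{\Gamma \times \Gamma}$ such that $U \cdot x = L^{(2)}(y, x)$ and $V \cdot x = L^{(3)}(x, z, w)$ etc.
for all $x \in \mathbb{R}^\Gamma$. It is straightforward to verify that

$$A = L^{(1)}(\cdot) + L^{(2)}(\mathbf{1}, \cdot) + L^{(2)}(\cdot, \mathbf{1}) + L^{(3)}(\mathbf{1}, \mathbf{1}, \cdot) + L^{(3)}(\mathbf{1}, \cdot, \mathbf{1}) + L^{(3)}(\cdot, \mathbf{1}, \mathbf{1}),$$

where $\mathbf{1}$ denotes the vector with all ones. Let $f : \mathbb{R}^\Gamma \to \mathbb{R}^\Gamma$ be the function with

$$f(x) := L^{(0)} + L^{(1)}(x) + L^{(2)}(x, x) + L^{(3)}(x, x, x).$$

This "generating function" $f$ plays a central role in the branching process literature.
Notice that $f$ characterises the branching process $S$ up to the order of the children.
Define $q^{(0)} := \mathbf{0}$ (i.e., the vector with all zeros) and $q^{(i+1)} := f(q^{(i)})$ for all $i \in \mathbb{N}$. It
is well-known [18] (and straightforward to show by induction on $i$) that $\mathcal{P}(T_X \le i) =
q^{(i)}_X$. Define $r^{(i)} := \mathbf{1} - q^{(i)}$. Using either the definition of $r^{(i)}$ or the fact that $r^{(i)}_X =
\mathcal{P}(T_X > i)$, one can easily check

$$
\begin{aligned}
r^{(i+1)} &= L^{(1)}(r^{(i)}) + L^{(2)}(\mathbf{1}, r^{(i)}) + L^{(2)}(r^{(i)}, \mathbf{1} - r^{(i)}) \\
&\quad + L^{(3)}(\mathbf{1}, \mathbf{1}, r^{(i)}) + L^{(3)}(\mathbf{1}, r^{(i)}, \mathbf{1} - r^{(i)}) + L^{(3)}(r^{(i)}, \mathbf{1} - r^{(i)}, \mathbf{1} - r^{(i)}) \\
&= A \cdot r^{(i)} - L^{(2)}(r^{(i)}, r^{(i)}) - L^{(3)}(\mathbf{1}, r^{(i)}, r^{(i)}) - L^{(3)}(r^{(i)}, \mathbf{1}, r^{(i)}) \\
&\quad - L^{(3)}(r^{(i)}, r^{(i)}, \mathbf{1}) + L^{(3)}(r^{(i)}, r^{(i)}, r^{(i)}).
\end{aligned}
$$

By defining

$$B(x) := L^{(2)}(x, \cdot) + L^{(3)}(\mathbf{1}, x, \cdot) + L^{(3)}(x, \mathbf{1}, \cdot) + L^{(3)}(x, \cdot, \mathbf{1})$$

we get

$$r^{(i+1)} = (A - B(r^{(i)})) \cdot r^{(i)} + L^{(3)}(r^{(i)}, r^{(i)}, r^{(i)}). \qquad (2)$$

Note that $A - B(x)$ and $B(x)$ are nonnegative for $x \in [0, 1]^\Gamma$ and that $B(\varepsilon \cdot x) =
\varepsilon \cdot B(x)$ for all $\varepsilon \in \mathbb{R}$.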
In the following, for a vector $x$, we write $x_{\min}$ and $x_{\max}$ for the minimal and
maximal entry of $x$. It is easy to see that there is $s \in (0, 1]$ such that for all $X, Y \in \Gamma$
we have $\mathcal{P}(T_X > n) \ge s \cdot \mathcal{P}(T_Y > n)$ for all $n \in \mathbb{N}$. (For instance, take for $s$ the
probability to reach, starting in $X$, a tree with $Y$ in at most $|\Gamma|$ steps. This probability
is positive, because $A$ is strongly connected.) It follows that $r^{(n)}_{\min} \ge s \cdot r^{(n)}_{\max}$.

As $\rho(A) \ge 1$ and $A$ is strongly connected, Perron-Frobenius theory (see [3]) asserts
that there is a vector $u \in (0, 1]^\Gamma$, strictly positive in all components, such that $A \cdot u \ge
u$. (For instance, one can take the dominant eigenvector of $A$.) W.l.o.g. we can take
$u_{\max} = s$. Choose $d \in \mathbb{R}_+$ such that

$$B(\mathbf{1}) \cdot u \le d \cdot u. \qquad (3)$$

Define a sequence $(\varepsilon_n)_{n \in \mathbb{N}}$ by setting $\varepsilon_n := r^{(n)}_{\max}$. As $r^{(n)}_{\min} \ge s \cdot \varepsilon_n = \varepsilon_n \cdot u_{\max}$, we
have

$$\varepsilon_n \cdot \mathbf{1} \ge r^{(n)} \ge \varepsilon_n \cdot u. \qquad (4)$$

Observe that, since $A$ is strongly connected, we have $\mathcal{P}(T_X > n) > 0$ for all $X$ and
all $n$, and hence $\varepsilon_n \ge \varepsilon_{n+1} > 0$. If $(\varepsilon_n)_n$ does not converge to 0, then there is $c > 0$
and $X \in \Gamma$ such that $\mathcal{P}(T_X > n) \ge c$ for all $n \in \mathbb{N}$, already implying that $\mathbb{E}T_X = \infty$.
So we assume in the following that $\lim_{n \to \infty} \varepsilon_n = 0$. In particular, there is $n_\bot \in \mathbb{N}$
such that $\varepsilon_n d < 1$ for all $n \ge n_\bot$. Now we show, for all $n \ge n_\bot$ and all $i \in \mathbb{N}$, that

$$r^{(n+i)} \ge (1 - \varepsilon_n d)^i \, \varepsilon_n u. \qquad (5)$$

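We proceed by induction on $i$. The induction base ($i = 0$) follows from (4). Let $i \ge 0$.
We have:

$$
\begin{aligned}
r^{(n+i+1)} &\ge (A - B(r^{(n+i)})) \cdot r^{(n+i)} && \text{(by (2))} \\
&\ge (A - B(r^{(n+i)})) \cdot (1 - \varepsilon_n d)^i \varepsilon_n \cdot u && \text{(by induction hypothesis)} \\
&\ge (1 - \varepsilon_n d)^i \varepsilon_n \cdot (u - B(r^{(n+i)}) \cdot u) && \text{(as } Au \ge u\text{)} \\
&\ge (1 - \varepsilon_n d)^i \varepsilon_n \cdot (u - B(\varepsilon_n \cdot \mathbf{1}) \cdot u) && \text{(by (4))} \\
&\ge (1 - \varepsilon_n d)^i \varepsilon_n \cdot (u - \varepsilon_n d \cdot u) && \text{(by (3))} \\
&= (1 - \varepsilon_n d)^{i+1} \varepsilon_n \cdot u
\end{aligned}
$$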

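This proves (5). Now we have for all $n \ge n_\bot$:

$$\sum_{i=0}^{k} r^{(n+i)} \ge \sum_{i=0}^{k} (1 - \varepsilon_n d)^i \varepsilon_n u \quad \text{(by (5))}
 \quad = \frac{1 - (1 - \varepsilon_n d)^{k+1}}{1 - (1 - \varepsilon_n d)} \, \varepsilon_n u = \frac{1 - (1 - \varepsilon_n d)^{k+1}}{d} \, u,$$

so, for every $n \ge n_\bot$ there exists some $k(n) \in \mathbb{N}$ such that $\sum_{i=n}^{k(n)} r^{(i)} \ge \frac{1}{2d} u$. Hence,
for any $X \in \Gamma$ we have

$$\mathbb{E}T_X = \sum_{i=0}^{\infty} \mathcal{P}(T_X > i) = \sum_{i=0}^{\infty} r^{(i)}_X \ge \sum_{i=n_\bot}^{\infty} r^{(i)}_X
 = \sum_{i=n_\bot}^{k(n_\bot)} r^{(i)}_X + \sum_{i=k(n_\bot)+1}^{k(k(n_\bot)+1)} r^{(i)}_X + \cdots \ge \frac{1}{2d} u_X + \frac{1}{2d} u_X + \cdots = \infty,$$

because $u_X > 0$. This completes the proof. □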



D.3 Proof of Corollary 12

Here is a restatement of Corollary 12. 

Corollary 12. Consider a branching process with process symbols $\Gamma$ and $X_0 \in \Gamma$.
Then $\mathbb{E}W_{X_0}$ and $\mathbb{E}T_{X_0}$ are both finite or both infinite. Distinguishing between those
cases is in P.

Proof. By Proposition 11, the expectations $\mathbb{E}W_{X_0}$ and $\mathbb{E}T_{X_0}$ are both finite or both
infinite. To distinguish between those cases, compute in polynomial time the matrix $A$.
As $A$ is nonnegative, we have $\rho(A) \ge 1$ if and only if there is a nonnegative vector $x$
which is nonzero in at least one component and $Ax \ge x$ holds (i.e., "$\ge$" holds in all
components), see [3]. Therefore, $\rho(A) \ge 1$ if and only if the linear programming (LP)
problem "$Ax \ge x \ge 0$ and $\|x\|_1 = 1$" is feasible. This can be decided in P. We remark
that a similar method was used in [16]. □
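
For small matrices with rational entries, the case distinction can also be carried out without an LP solver. The Python sketch below is not the method of the proof: it swaps in an exact linear solve and relies on the following standard fact for nonnegative $A$, which we state as the assumption behind the sketch: $\rho(A) < 1$ holds if and only if $I - A$ is nonsingular and the solution $x$ of $(I - A)x = \mathbf{1}$ is strictly positive. (If $\rho(A) < 1$, then $x = \sum_i A^i \mathbf{1} \ge \mathbf{1}$; conversely, $x > 0$ and $Ax = x - \mathbf{1} < x$ force $\rho(A) < 1$.) The function names are ours.

```python
from fractions import Fraction

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def expected_work_if_finite(A):
    """Return (I - A)^{-1} * 1 if rho(A) < 1, and None if rho(A) >= 1.

    A must be a square nonnegative matrix with rational entries."""
    n = len(A)
    i_minus_a = [[(1 if i == j else 0) - Fraction(A[i][j]) for j in range(n)]
                 for i in range(n)]
    x = solve(i_minus_a, [1] * n)
    if x is None or any(v <= 0 for v in x):
        return None
    return x
```

For example, `expected_work_if_finite([[Fraction(1, 2)]])` yields `[2]`, whereas the matrices `[[1]]` and `[[0, 2], [2, 0]]` (spectral radii 1 and 2) yield `None`. When the result is not `None`, its components are the expected works from Proposition 11.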

D.4 Proof of Theorem 13 

Here is a restatement of Theorem 13. 

Theorem 13. Consider a pSJS $S$ with alphabet $\Sigma = \Gamma \cup Q$. Let $a \in \Gamma$. Then $\mathbb{E}W_a$ and
$\mathbb{E}T_a$ are both finite or both infinite. Distinguishing between those cases is in PSPACE,
and PosSLP-hard even for pPDSs. Further, if $S$ is normalised and $\mathbb{E}W_a$ is finite, one
can efficiently express $\mathbb{E}W_a$.

Proof. If $S$ is not normalised, we normalise it in polynomial time (Section 3.2). This
does not change the finiteness of $\mathbb{E}W_a$ or $\mathbb{E}T_a$. Then we compute in polynomial time
(Theorem 6) the set $Q' := \{q \in Q \mid [a{\downarrow}q] > 0\}$. The values $[a{\downarrow}q]$ for $q \in Q'$
can be efficiently expressed (Theorem 6). Therefore, we can also efficiently express
$[a{\downarrow}] = \sum_{q \in Q'} [a{\downarrow}q]$ and decide in PSPACE if $[a{\downarrow}] = 1$ or $[a{\downarrow}] < 1$. If $[a{\downarrow}] < 1$,
then nonterminating runs have a positive probability and hence $\mathbb{E}W_a = \mathbb{E}T_a = \infty$.
Otherwise, we have, with Proposition 10,

$$\mathbb{E}W_a = \sum_{q \in Q'} [a{\downarrow}q] \cdot \mathbb{E}\,[W_a \mid Run(a{\downarrow}q)] = \sum_{q \in Q'} [a{\downarrow}q] \cdot \mathbb{E}W_{(\!|aq|\!)} =: W \qquad (6)$$

$$\text{and} \quad \mathbb{E}T_a = \sum_{q \in Q'} [a{\downarrow}q] \cdot \mathbb{E}\,[T_a \mid Run(a{\downarrow}q)] \ge \sum_{q \in Q'} [a{\downarrow}q] \cdot \mathbb{E}T_{(\!|aq|\!)} =: T. \qquad (7)$$

If $\mathbb{E}W_a$ is finite, then $\mathbb{E}T_a$ is finite, because $W_a \ge T_a$. If $\mathbb{E}T_a$ is finite, then $T$ is finite
by (7), hence $W$ is finite by Corollary 12, hence $\mathbb{E}W_a$ is finite by (6). Therefore, $\mathbb{E}W_a$
is finite if and only if $\mathbb{E}T_a$ is finite.

In order to decide if $\mathbb{E}W_a$ is finite, by (6) it suffices to decide if $\mathbb{E}W_{(\!|aq|\!)}$ is finite
for all $q \in Q'$. We cannot use Corollary 12 directly, because the coefficients of the
branching process from Proposition 10 are not explicitly given. However, by Theorem 6
they are efficiently expressible and, therefore, so is the matrix $A$ from Proposition 11.
So we can decide in PSPACE whether $\mathbb{E}W_{(\!|aq|\!)}$ is finite by deciding whether the formula
"$\exists x : Ax \ge x \ge 0$ and $\|x\|_1 = 1$" in $ExTh(\mathbb{R})$ is true (cf. the proof of Corollary 12).

If $\mathbb{E}W_a$ is finite, we wish to efficiently express it. We use again $W$ from (6), so it
suffices to show that $\mathbb{E}W_{(\!|aq|\!)}$ is efficiently expressible for all $q \in Q'$. For that, notice
again that the matrix $A$ from Proposition 11 is efficiently expressible, and consider the
formula "$(I - A)x = \mathbf{1}$" in $ExTh(\mathbb{R})$. Existentially quantify all variables in $x$ except
the $(\!|aq|\!)$-component. By Proposition 11, the resulting formula efficiently expresses
$\mathbb{E}W_{(\!|aq|\!)}$.

It remains to show PosSLP-hardness for pPDSs. We draw from a reduction in [16],
where it is shown that deciding if a pPDS terminates with probability 1 is PosSLP-hard.
As a gadget for that, Etessami and Yannakakis [16] compute, given a PosSLP
instance, a pPDS $S$ with the following properties. (More precisely, they construct an
equivalent recursive Markov chain.) The starting configuration is $qa$, and after having
left the initial configuration, $S$ reaches a configuration of the form $q\alpha$ with $\alpha \in \Gamma^*$
again with probability 1. At that time, the configuration is $q$ with some probability $p$,
and $qaa$ with probability $1-p$. Moreover, the time (and hence work) needed to reach
either of those configurations is essentially bounded by the size of the given PosSLP
instance, so it is finite. Furthermore, the given PosSLP instance is a "yes instance" if
and only if $p > \tfrac{1}{2}$. It is easy to see that $\mathbb{E}W_{qa}$ is finite in $S$ if and only if $\mathbb{E}W_X$ is
finite in the branching process $\bar{S}$ that consists only of the transitions $X \stackrel{p}{\hookrightarrow} \bot$ and
$X \stackrel{1-p}{\hookrightarrow} \langle XX\rangle$. The $1 \times 1$-matrix $A$ for $\bar{S}$ from Proposition 11 consists of a single
entry $2 \cdot (1-p)$. Consequently, $\mathbb{E}W_{qa}$ is finite in $S$ if and only if $2 \cdot (1-p) < 1$, which
is equivalent to $p > \tfrac{1}{2}$. This completes the reduction. We remark that the reduction does
not show that deciding if $\mathbb{E}W$ is finite is PosSLP-hard for branching processes, because
$S$ has more control states than just $q$. (Recall that the problem for branching processes
is in P by Corollary 12.) □



D.5 Proof of Proposition 14 

Here is a restatement of Proposition 14. 

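Proposition 14. Consider the family of branching processes with transitions $X \stackrel{p}{\hookrightarrow}
\langle XX\rangle$ and $X \stackrel{1-p}{\hookrightarrow} \bot$, where $0 < p < 1/2$. Then the ratio $\mathbb{E}[W_X] / \mathbb{E}[T_X]$ is unbounded
for $p \to 1/2$.

Proof. By Proposition 11, we have $\mathbb{E}W_X = 1/(1 - 2p)$. It is shown in [24] that
$\lim_{p \to 1/2} \frac{\mathbb{E}T_X}{-2 \ln(1 - 2p)} = 1$. Consequently, we have $\lim_{p \to 1/2} \mathbb{E}[W_X] / \mathbb{E}[T_X] = \infty$. □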
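
The growth of the ratio can be observed numerically. The sketch below is our own illustration: it uses $\mathbb{E}T_X = \sum_{i \ge 0} \mathcal{P}(T_X > i)$ and, for this family, the recurrence $r^{(i+1)} = p \cdot r^{(i)} \cdot (2 - r^{(i)})$ with $r^{(0)} = 1$, which follows from $r^{(i+1)} = 1 - f(1 - r^{(i)})$ and $f(x) = (1-p) + p x^2$. The summation is cut off once the terms drop below a tolerance, which is an assumption of the sketch and makes the result an approximation.

```python
def expected_time(p, tol=1e-15):
    """Approximate E[T_X] = sum_i P(T_X > i) for X -> <XX> (prob. p), X -> bottom (prob. 1 - p).

    Assumes 0 <= p < 1/2, so that the terms decay geometrically."""
    r, total = 1.0, 0.0            # r^(0) = P(T_X > 0) = 1
    while r >= tol:
        total += r
        r = p * r * (2.0 - r)      # r^(i+1) = 1 - f(1 - r^(i))
    return total

def expected_work(p):
    """E[W_X] = 1 / (1 - 2p), as given by Proposition 11."""
    return 1.0 / (1.0 - 2.0 * p)

def ratio(p):
    return expected_work(p) / expected_time(p)

ratios = [ratio(p) for p in (0.4, 0.49, 0.499)]
```

The computed ratios increase as $p$ approaches $1/2$: the expected work grows like $1/(1-2p)$, whereas the expected time grows only logarithmically in $1/(1-2p)$.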
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